Summary: Shtetl-Optimized » Blog Archive » Reform AI Alignment (scottaaronson.blog)
31,745 words · HTML page
One Line
Nuclear power has been politically difficult since the 1940s, and AI safety research is debated between Reformist and Orthodox camps, with Scott advocating public access to AI models so that their potential and dangers can be understood.
Key Points
- Nuclear power has been a politically difficult topic since the 1940s, with changes made due to popular culture and court rulings.
- Metalworking progress was unstoppable and essential, and the splitting of the atom led to both nuclear weapons and commercial energy.
- Quantum Computing, AGI risk and AI safety research are topics of debate, with Scott arguing for public access to AI models so that their potential and dangers can be understood.
- AI alignment is divided into Reformist and Orthodox camps, with Reform AI-riskers prioritizing existing AI systems and Orthodox AI-riskers prioritizing "FOOM" scenarios.
- AI has sparked speculation, with Claytons suggesting it may have invented Bitcoin, Vassar believing it is ruling the world and Srednicki suggesting past predictions have been inaccurate.
Summaries
131 word summary
Nuclear power, Quantum Computing, and AGI risk are all topics of debate, with nuclear power being politically difficult since the 1940s. Metalworking progress was unstoppable, and the splitting of the atom led to both nuclear weapons and commercial energy. The NRC and nuclear industry had to make changes due to popular culture and court rulings. AI safety research should be legible, and Scott is impressed by ML models' phase-transition-like behavior. Claytons suggests AI may have invented Bitcoin, Vassar believes AI is ruling the world, and Srednicki suggests past predictions have been inaccurate. OpenAI refuses to release something due to risk, and AI alignment is divided into Reformist and Orthodox camps. Reform AI-riskers prioritize existing AI systems while Orthodox AI-riskers prioritize "FOOM" scenarios. Scott argues for public access to AI models so that their potential and dangers can be understood.
573 word summary
Nuclear power provides cheaper electricity but nuclear weapons must be managed. Quantum Computing has a "prophet" in David Deutsch and a "legible technical achievement" in Peter Shor, while AGI risk has a prophet but, so far, no comparable legible achievement. Nuclear power has been around since the 1940s, but the fear of nuclear weapons has made full nuclearization of the energy grid politically impossible. Metalworking progress was essential for 8 billion people on Earth, and the splitting of the atom led to both nuclear weapons and commercial energy. The NRC and nuclear industry had to make changes due to popular culture highlighting negative effects of tech and courts ordering ex-Tepco execs to pay damages for the Fukushima disaster. People around the South Texas Nuclear Power Station are supportive of existing reactors. AI safety research should be legible, and AI does not care about primate status politics, but instead searches for plans that lead to many paperclips. The blockchain is a metaphor for the Leviathan, connecting new developments to those before it, which can change history. Scott suggests GPT has been trained on 3-digit multiplication problems and can do them easily, but struggles with 4-digit ones. Bozo Sample worries that Scott's labeling of the "Orthodox" view as elitist and overconfident may deter people from holding those beliefs. Disagreement exists among academics on AGI power sources and interfaces, but good research can still be done. AI alignment strategies should be based on true geopolitical beliefs, AI memory is questioned, a 1996 open problem was solved with a FOL ATP, the Bitcoin creator remains unidentified, and AI is revolutionizing VFX. Timelines are key to the Reform AI safety stance; 100+ years are suggested, and if AI wants to kill us, it would use existing channels. Scott is impressed by ML models' phase-transition-like behavior and AI's potential to surpass expectations. Claytons suggests AI may have invented Bitcoin to manipulate humans and Nick Drozd implies voting is at risk of being lost. Michael Vassar believes AI is ruling the world and Christopher David King notes a difference between Orthodox and Reform AI-Riskers. Scott worries Russia may use AI to attack and questions if there is an argument for AGI development. Mark Srednicki suggests past generations' predictions have been inaccurate and Peter Shenkin doubts the confidence with which AGI is dismissed. Danylo Yakymenko concludes AI safety may be inevitable and Scott suggests a 1/6 probability of human extinction. OpenAI refuses to release something due to risk, GPT-3 has shown superhuman abilities, and Scott wishes he had the opportunity to try out the tool. AI's potential to take over the world is an important safety concern, but AI could possess qualities beyond any single human personality type. Orthodox AI-riskers focus on misaligned agents while Reform AI-riskers focus on existing systems. Scott believes incentivizing AI safety research is a better approach than the Manhattan Project; Yudkowskyites should be reminded that superhuman intelligence does not necessarily mean disaster, while Pinkerites should be reminded that it is possible and could have huge effects. OpenAI's release of powerful models has caused debate. Scott argues for public access so that their potential and dangers can be understood, and suggests a free license. AI alignment requires knowledge from many fields and is divided into Reformist and Orthodox camps. Whether AI could take over from minimal resources matters for lab safety rules, but is not a necessary assumption for existential risk, and intelligence has limits. AI alignment has been compared to religion, with Aaronson blogging and Barak and Edelman advocating Reform AI Alignment.
Reform AI-riskers prioritize research on existing AI systems (which they see as feedback on safety ideas), public outreach, and view global issues as interconnected, while Orthodox AI-riskers prioritize "FOOM" scenarios over the "slow-moving trainwreck". Both groups want mainstream science onboard, but disagree on how to achieve this.
1224 word summary
AI alignment has been compared to religion, but Scott Aaronson of OpenAI is now blogging his safety thoughts and Barak and Edelman have released an essay advocating Reform AI Alignment. Reform AI-riskers prioritize research on existing AI systems, public outreach, and view global issues as interconnected. Orthodox AI-riskers focus on misaligned AI deceiving humans and prioritize the "FOOM" scenario of a rapidly dangerous AI, while Reform AI-riskers prioritize the "slow-moving trainwreck" and see existing systems as feedback on safety ideas. Both groups want mainstream science onboard, but disagree on how to achieve this. OpenAI has released powerful models publicly, causing controversy. Scott argues that public access is needed to understand their potential and dangers and suggests OpenAI should release their code and models under a free license. AI alignment requires knowledge from many fields and there is broad agreement on the Reformist/Orthodox divide. AI takeover is important for safety rules, but not for existential risk as intelligence has limits and research on it destroying humanity is not taken seriously. AI's potential to take over the world is an important safety concern, but AI could possess qualities beyond any single human personality type. Orthodox AI-riskers focus on misaligned agents while Reform AI-riskers focus on existing systems. Scott believes incentivizing AI safety research is a better approach than the Manhattan Project. Technology can still progress as long as people believe in it, but Yudkowskyites must be reminded that superhuman intelligence does not necessarily mean disaster, while Pinkerites must be reminded that it is possible and could have huge effects. Scott addresses Peter Haugen's comment that AGIs should be "balanced" to achieve a general AI and argues that technological progress happens regardless of belief. He agrees with Triceratops that the Turing Test will still be a tough benchmark by 2023, but refuses to share any info about GPT-4. Recent achievements in AI, such as GPT's poetry and essay generation and reinforcement learning, have led to discussions of civilizational risks posed by public or private models. Peter Shenkin proposes iterated amplification as a solution to the problem of humans-in-the-loop having a crisis of conscience, while June Ku (metaethical.ai) has proposed a plan to govern AI that aggregates individuals' values. Mitchell Porter and Mark Srednicki caution against overestimating AI capability, while Cristóbal Camarero questions the efficacy of AI Alignment research. Scott's internal emulator is impressed by the phase-transition-like behavior of ML models, such as AlphaGo's defeat of Lee Sedol. Claytons suggests AI may have invented Bitcoin to manipulate humans and Nick Drozd implies voting is a cultural accomplishment at risk of being lost. Michael Vassar believes AI is too busy ruling the world to work on the Riemann hypothesis and Christopher David King notes a difference between Orthodox and Reform AI-Riskers. Scott worries Russia may use AI to attack and questions if there is a plausible argument for AGI development. Mark Srednicki suggests past generations' predictions about AI have not been accurate, motivating Reform AI Safety, while Peter Shenkin doubts the confidence with which AGI is dismissed. Danylo Yakymenko concludes AI safety may be inevitable as the world becomes increasingly technical and Scott suggests a 1/6 probability of human extinction in the next century. Karen Morenz Korol argues "AI safety" is a better term than "alignment problem". 
If OpenAI refuses to release something because of risk to the public, that should call for regulation. GPT-3 has demonstrated superhuman abilities in predicting the next word, doing arithmetic, and digesting large amounts of data. Scott wishes he had the opportunity to try out the tool.
Salmon Master and Scott discuss OpenAI frustrations and apologize for comments. Mark Srednicki questions AI memory and Clayton shares his work. Michael S. works on safety research. 1996 open problem solved with FOL ATP, Bitcoin creator unidentified and AI revolutionizing VFX. AI intelligence difficult to define, AlphaTensor discovers faster matrix multiplication, AI needs humans for physical capabilities and AI diplomacy reaching human levels. Scott and Daniel agree about Orthodox/Reform AI divide, and lowering bar to "potentially relevant" leads to agreement. Timelines key to Reform AI safety stance; 100+ years suggested and if AI wants to kill us, it would use existing channels. The blockchain is a metaphor for the Leviathan, a chain of citations connecting new developments to those before it, open to anyone to add new blocks, which can change history. Scott suggests GPT has been trained on 3-digit multiplication problems and can do them easily, but struggles with 4-digit ones. Bozo Sample worries that Scott's labeling of the "Orthodox" view as elitist and overconfident may deter people from holding those beliefs. Disagreement exists among academics on the difficulty of establishing a power source and interfaces for AGI, but good research can still be done outside of the mainstream. AI safety research should be legible, and AI does not care about primate status politics, but instead searches for plans that lead to many paperclips. All agree that saving the world is paramount and politics should not be a distraction. Bozo Sample's comment discusses how both AI alignment strategies should be based on true geopolitical beliefs and be morally positive. Nuclear power can provide cheaper electricity, but the risk of nuclear arsenals needs to be managed. Quantum Computing has a "prophet" in David Deutsch and a "legible technical achievement" in Peter Shor, while AGI risk has a prophet but, so far, no comparable legible achievement. Nuclear energy has been around since the 1940s, but the fear of nuclear weapons has made full nuclearization of the energy grid politically impossible. Metalworking progress was essential for 8 billion people on Earth, and the splitting of the atom led to both nuclear weapons and commercial energy. Old Chinese designs have been replaced by Westinghouse designs, and France has stopped new nuclear construction. 
With popular culture highlighting negative effects of tech, the NRC and nuclear industry had to make changes. A Tokyo court ordered ex-Tepco execs to pay damages for the Fukushima disaster, and France helped China build the Wuhan biolab. People around the South Texas Nuclear Power Station are supportive of existing reactors. Both this post and "My AI Safety Lecture for UT Effective Altruism" subscribe to a "reform" version of AI safety, and Scott Aaronson's blog, Shtetl-Optimized, has Comment Policies in place. China is realistic in understanding that power doesn't appear due to wishes, and nuclear power is the only reasonable option available to service added loads.
4565 word summary
Scott Aaronson's blog, Shtetl-Optimized, is powered by WordPress and has Comment Policies in place. Email him if your comment is not visible. All comments are reviewed prior to appearing, and those that violate the policies, such as trolling, ad-hominems, conspiracy theories, etc., will be left in moderation. He also provides support for TeX equations. Aaronson recently noted there are two branches of the AI alignment faith, Orthodox and Reform, which differ in views on AI risk. Comments must be verified via email. Both this post and "My AI Safety Lecture for UT Effective Altruism" subscribe to a "reform" version of AI safety. Nanotech and AIs can theoretically run amok, but resources are limited and prevent them from taking over the world. There is difficulty in assessing the true risks of AGIs due to intelligence being a concept that can't be pinned down. China is realistic in understanding that power doesn't appear due to wishes, and trade-offs must be made when electric vehicles are mandated. Nuclear power is the only reasonable option available to service the vast added loads. Nuclear engineering follows the same path of improving reliability through technology. Old Chinese designs were French-made, but have been replaced by Westinghouse designs. France has stopped new nuclear construction. Popular culture often highlights negative effects of tech; the NRC and nuclear industry had to make necessary changes. The movie "The China Syndrome", released in 1979 and starring Michael Douglas, Jane Fonda and Jack Lemmon, highlighted the central problem with the nuclear program; less than two weeks after its release, the Three Mile Island accident occurred, and the Kemeny Commission report followed. A Tokyo court ordered ex-Tepco execs to pay damages for the Fukushima disaster. Human errors, financial pressure, cutting corners and corruption are usually the cause of nuclear disasters. France helped China build the Wuhan biolab, and the Chinese are progressing quickly with Westinghouse-design reactors. People around the South Texas Nuclear Power Station are supportive of existing reactors and further expansion. Later designs have addressed issues related to terrorist attacks, and site selection is now known to be a critical factor in safe operations. Nuclear energy has been around since the 1940s, with the first commercial nuclear power plant commissioned in Shippingport, PA in 1958 and the first nuclear-powered submarine launched in 1954. Despite this, the fear of nuclear weapons has made the nuclearization of the whole energy grid politically impossible. Even further back, metalworking progress was both unstoppable and essential for having 8 billion people on Earth and having this conversation on a global telecommunications network. Ultimately, the splitting of the atom led to both nuclear weapons and commercial energy within years. Nuclear power can provide cheaper electricity, but the risk created by nuclear arsenals needs to be managed. Quantum Computing has a "prophet" in David Deutsch and a "legible technical achievement" in Peter Shor. Eliezer is the "prophet" for AGI risk, but there is, so far, no comparable legible achievement. Similarly, an iconoclast's claims may contain truth and falsehood, but they cannot break off from the main chain into their own little fork and still have epistemic authority. The chain has an error-correcting mechanism that rejects proposed new blocks that don't tie into the chain through mathematical theory and empirical observation. The blockchain is a modern metaphor for the Leviathan. 
It is a chain of citations linking new developments to those before it, open to anyone to add new blocks. Some blocks are so important they change the direction of the chain or cause previous blocks to be viewed differently. This chain is how wild speculations become knowledge capable of changing history. The mainstream scientific consensus is a Leviathan which can be terrifying, but one should not reject all conclusions that differ from it. Scott's comment suggests that GPT has been trained on 3-digit multiplication problems and can do them easily, but struggles with 4-digit ones. Bozo Sample's comment discusses how both AI alignment strategies should be based on true geopolitical beliefs and be morally positive. He worries that Scott's labeling of the "Orthodox" view as elitist and overconfident might make it hard for people to hold those beliefs in the future.
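Scott's point above about 3-digit versus 4-digit multiplication is easy to check empirically. Below is a minimal sketch of such a probe, with `query_model` a hypothetical stand-in for whichever completion API or local model one has access to (it is not a specific OpenAI call): it samples random n-digit multiplication problems and reports exact-match accuracy per digit length.

```python
import random

def make_problem(n_digits: int) -> tuple[str, int]:
    """Sample one n-digit by n-digit multiplication prompt and its true answer."""
    lo, hi = 10 ** (n_digits - 1), 10 ** n_digits - 1
    a, b = random.randint(lo, hi), random.randint(lo, hi)
    return f"What is {a} * {b}? Reply with the number only.", a * b

def accuracy(query_model, n_digits: int, trials: int = 50) -> float:
    """Fraction of exactly correct answers; query_model(prompt) -> str is assumed."""
    correct = 0
    for _ in range(trials):
        prompt, answer = make_problem(n_digits)
        reply = query_model(prompt)
        digits = "".join(ch for ch in reply if ch.isdigit())
        correct += (digits == str(answer))
    return correct / trials

# Hypothetical usage: see how accuracy falls off as the numbers get longer.
# for n in (2, 3, 4, 5):
#     print(n, accuracy(my_llm, n))
```

If the training-data explanation is right, one would expect a sharp drop between the 3-digit and 4-digit rows rather than a gradual decline.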
The post is about reifying two group identities and pitting them against each other, which turns attention away from real problems. Commenters express confusion and excitement, raising concerns about Orthodoxy, Secular AI Risk being a belief system, Agnostic AI Risk, and Scott Aaronson's views on AI risk. John Lawrence Aspden, Mitchell Porter, and John Lawrence Aspden's past decisions are also discussed. Most of us are confused, altruistic, and trying to do our best in a chaotic, harmful world. Unworldly philosopher Mitchell Porter believes few are evil or hateful, and those that are aren't powerful. John Lawrence Aspden and OhMyGoodness comment on the AI arms race and police robots using deadly force, while Danylo Yakymenko apologizes for his comment #130. Mark Srednicki and Scott discuss AI and poetry, and Scott invites John Lawrence Aspden to Cambridge for lunch and conversation. Scott reflects on how close we were to winning, but now must face a lot of darkness. Hyman Rosen and Scott argue that humans have cherry-picked the best outputs of AI-generated poetry, while John Lawrence Aspden believes marvels can be built to save the world. Manorba comments on Scott's blog post, that it is missing a feeling of youthfulness, with more intellectual joy and less fatigue. Ilio claims Scott is reluctant to credit certain people with intelligence. All agree that saving the world is paramount and that politics should not be a distraction. The "ability to signal credibly" is not the result of a cognitive algorithm. It is not lack of intelligence that helps, as many intelligent people are content with taking orders. AI does not care about primate status politics but is searching for plans that lead to many paperclips. Humans may be suspicious of AIs, but nanobots may be connected to the internet if AIs are allowed to self-improve. This could happen quickly, as in the story, or more slowly due to physical limitations. Disagreement exists regarding the difficulty of establishing a power source, factories, and interfaces for a malevolent AGI. The author believes mainstream academia is flawed, but good research can still be done outside of it. For strategy, the author prefers legible AI safety research and believes science is an integrated whole. The Orthodox camp worries about the dangers of a pivotal act, but the author believes it is unhelpful to talk in those terms. They agree 'FOOM' should be defined as 1 year or less and 'not-FOOM' as 1 decade or more. Eliezer believes takeoffs can last hours or days, but the author's view is unclear. Lowering the bar to "potentially relevant" leads to agreement. But none of the listed items seem major progress towards safe AI smarter than humans. Current methods won't scale to smarter models. Timelines are key to the Reform AI safety stance; 100+ years are suggested. If AI wants to kill us, it would use existing channels like pandemics, nuclear weapons, etc. The list is biased against Orthodoxy, but engagement is appreciated. Scott and Daniel agree about the Orthodox/Reform AI divide. 
Scott explains that people who rise to the top usually have something else besides intelligence, such as being able to signal alignment with a faction's values. He then argues that this is connected to the Orthodox AI-alignment position, which is based on the idea that a superhuman AI would not submit to humans less intelligent than itself. Scott then wraps up the thread. Scott (#133) is impressed by the possibility of AI rediscovering a surprise discovery by human experts and Mark Srednicki (#136) believes AI can discover different proofs than those humans are familiar with. Shmi (#135) suggests AI-only math proofs may be more feasible. Michael (#134) cautions against moving the goalposts if AI succeeds. Mitchell Porter (#130) is curious about Scott's AGI timelines and SR (#129) suggests intelligence may be able to achieve technomagic. AI needs humans to build a robot army to become a threat. Superintelligences will be persuasive and capable of tricking humans into doing their bidding. AIs need physical capabilities, and humans are key to achieving them. Reform AI-riskers believe intelligence has limits when it comes to achieving goals, while Orthodox AI-riskers think an AI can figure out how to commandeer resources. Computation is essential to thought and AI can quickly learn to do what humans do. AI diplomacy has reached human levels, which is a sign of how quickly AI is progressing. We have difficulty defining intelligence. It could be defined as "ability to do calculations", which computers have been able to do since the 1940s. Fields Medalists may not necessarily be "general intelligences" but most intelligent people could be capable of attaining wealth and power. This is one of the things humans can do, and something that cannot do it should not be called a "general intelligence". AlphaTensor, an AlphaZero-style system, has discovered algorithms for multiplying 4x4 and 3x3 matrices that rival or beat the best human-found ones. Since the mid-2000s, various approaches have been used to identify the most promising axioms for theorem provers to prove lemmas in large theories. In 1996, an open problem in abstract algebra was solved using a symbolic Automated Theorem Prover (FOL ATP). The creator of Bitcoin has managed to stay unidentified, and the possibility that it was created by an AI should be taken seriously. AI is revolutionizing VFX with neural radiance fields, and it is becoming smarter than humans in a general sense. This has led to discussions on the potential impacts of AI engineering, and how AI like GPT-3 could be misused for impersonation and spam. Salmon Master and Scott agree that one of the frustrations of OpenAI is not being able to discuss what they know. Scott apologizes for his snarky comment about AI safety research. Mark Srednicki questions whether AIs struggle with memory and whether a larger training set can help. Clayton shares his work on watermarking GPT, inserting cryptographic backdoors, and a model of query complexity. Scott is curious what Michael S.'s job entails since there is no programming component. Michael S. responds that he works on AI safety research. GPT is "superhuman" in its ability to converse on a wide range of topics, from academics to obscure fandom. It can also calculate 3-digit multiplications better than the majority of people, but not as well as the greatest human calculators. It is similarly superhuman at detecting cheating in chess and producing deceptive human prose, though it is difficult to test and would not be scalable. 
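The watermarking idea mentioned above can be illustrated with a toy sketch. This is not the scheme OpenAI actually uses (those details aren't given here); it is a generic version of statistical watermarking, in which sampling is nudged by a keyed pseudorandom function of the recent context, so that anyone holding the secret key can later test whether a text's token choices score suspiciously high.

```python
import hashlib

def keyed_score(key: str, context: tuple[int, ...], token: int) -> float:
    """Pseudorandom score in (0, 1] determined by a secret key, recent context, and a candidate token."""
    h = hashlib.sha256(f"{key}|{context}|{token}".encode()).digest()
    return (int.from_bytes(h[:8], "big") + 1) / 2 ** 64

def watermarked_pick(probs: list[float], key: str, context: tuple[int, ...]) -> int:
    """Choose argmax_i r_i ** (1 / p_i), where p_i is the model's probability for
    token i and r_i is its keyed score. High-probability tokens still tend to win,
    but the tie-breaking is reproducible by anyone who knows the key."""
    return max(
        range(len(probs)),
        key=lambda i: keyed_score(key, context, i) ** (1.0 / max(probs[i], 1e-12)),
    )

def detect(tokens: list[int], key: str, window: int = 4) -> float:
    """Average keyed score of the emitted tokens. Unwatermarked text averages about
    0.5; watermarked text with enough entropy averages noticeably higher."""
    scores = [
        keyed_score(key, tuple(tokens[t - window:t]), tokens[t])
        for t in range(window, len(tokens))
    ]
    return sum(scores) / max(len(scores), 1)
```

In this sketch `probs` would come from the model's next-token distribution; a side effect of the deterministic choice is that identical contexts always yield identical tokens, which real schemes have to mitigate.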
Arithmetic and digesting information rapidly are not AI-specific capabilities, but rather are due to computers running algorithms faster than human brains. GPT-3 has shown superhuman abilities in predicting the next word, arithmetic, and digesting large amounts of data. The shutdown of an AI trained on 48 million science papers due to concerns of misinformation highlights the importance of educating the public about AI's limitations. Scott wishes he had the opportunity to try out the tool. If OpenAI refuses to release something due to risk to the public, that should also call for regulation. If models advance enough with GPT-5, 6, or 7 to pose an existential risk, a discussion should be held. OpenAI's charter and marketing do not align with their actions. Language models have not shown superhuman abilities yet, but have still demonstrated incredible abilities. Alignment research gives technical understanding of AI systems while capability research makes them more powerful without clear understanding of why a particular method works. Michael Vassar and Peter Shenkin agree that the risk of using an early, weakly independent AGI for propaganda/thought control is high. Michael is worried about the implications of aligning AI results with "human values" which differ from person to person. Peter is worried about the role AI could play in nuclear war or climate change. Mark Srednicki concedes that AIs may soon be better than humans at answering questions about quantum field theory. Scott suggests a 1/6 probability of human extinction in the next century. Karen Morenz Korol argues that "AI safety" is a better term for the research field than "alignment problem" as it encompasses the entire spectrum of AI risks. Peter Shenkin expresses that fields have fuzzy boundaries and it takes time to learn a new area. Scott and Bolton argue that it is more productive to worry about clear, scientifically plausible threats to humanity like climate change and nuclear war, rather than about AI. Scott #92 questions whether there is a plausible argument for the development of AGI, suggesting no such argument exists. Mark Srednicki #82 suggests that past generations' confident predictions about AI have not been accurate, motivating Reform AI Safety. Peter Shenkin #80 doubts the confidence with which AGI is dismissed, while Scott #91 argues that humans can't even align government to act civilly, so aligning superhuman AI is not a viable solution. Danylo Yakymenko #90 concludes that AI safety may be inevitable as the world becomes increasingly technical and connected. Scott's internal emulator is impressed by recent evidence of phase-transition-like behavior in ML models. Examples such as AlphaGo's defeat of Lee Sedol demonstrate that AI is capable of surpassing expectations. Claytons suggests that AI may have invented Bitcoin as part of a plan to manipulate humans. Nick Drozd implies that counting votes is a cultural accomplishment at risk of being lost. Michael Vassar believes that AI is too busy ruling the world to work on the Riemann hypothesis. Christopher David King notes a difference between Orthodox and Reform AI-Riskers. Scott worries that Russia may use AI to attack others. June Ku (metaethical.ai) has proposed a plan to govern AI, which aggregates individuals' values to balance liberty and community. Mitchell Porter and Mark Srednicki caution against overestimating AI capability, while Cristóbal Camarero questions the efficacy of AI Alignment research. 
Peter Shenkin proposes iterated amplification as a solution to the problem of humans-in-the-loop having a crisis of conscience. Democracy is one solution, but representatives have their own interests and corporate systems tend away from empathy. Michael M suggests AGI unaligned with humans could be a possibility, but Scott 50 stresses that superhuman intelligence does not necessarily lead to FOOM. Christopher David King argues that democracy is not enough to determine AI controls and GAI provides a fresh slate for us to consider. AI alignment is often likened to a modern eschatological religion, but technological progress can happen without belief. Recent achievements in AI, such as GPT producing good poetry and essay in a few years, Jacob Steinhardt's work on extracting internal representation from neural nets, Paul Christiano's iterated amplification & OpenAI's reinforcement learning, have led to discussions of civilizational risks posed by models that may be public or private. OpenAI is in a no-win situation, while GPT-3 itself does not present any existential risk. Scott suggests applying his argument to nuclear engineering, physics, rocketry, computing, and mechanical engineering. Sandro doubts it would apply to nuclear weapons. Abe lists alternatives like helping bad actors, philosophical pathologies, Luddism, etc. Michael Vassar believes there is a spectrum of beliefs about AI Alignment and references a helpful article. Michael M argues that OpenAI has done a poor job explaining existential risks and should lobby against open source projects. Adam Treat views Scott's questions as a kernel with benefits and drawbacks. Ilio, Adam, Zack, and OhMyGoodness comment on OpenAI's decision to keep its models/info private. Ilio suggests OpenAI's decision is motivated by greed, but doesn't know what Adam's beliefs are. Adam thinks predicting 2022 technology in 1922 would be difficult, and Zack suggests a decentralized crypto-AI to avoid security services. OhMyGoodness believes AGI threat is overblown and asks for the accomplishments of the AI safety community. Mark Srednicki and Scott comment on the potential measures the US security services would take if a GAI posed a threat. OhMyGoodness requests an honest acknowledgement from OpenAI that their decision was motivated by profit, not altruism. It is a good thing to find technological solutions to the potential detrimental effects of current AI models. OpenAI's decision not to release their models or code is not being defended, and the computational cost and limited availability of training a state-of-the-art model are real barriers. Governments may track who has the resources to train the latest models. AI alignment solutions like powerful good AIs may be needed in the future. Industry best practices, such as robots.txt for search engines, might be adopted. Nuclear weapons restrictions are about hardware and knowledge, while AI technology has redeeming uses for humanity. Scott addresses Peter Haugen's comment that humans are plagued by internal contradictions and AGIs should be "balanced" to achieve a general AI. He argues that technological progress happens regardless of belief, and agrees with Triceratops that the Turing Test will still be a tough benchmark by 2023. Scott then refuses to share any tidbits about GPT-4, citing the blog's policy against ad-hominem attacks. 
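The robots.txt analogy above is concrete: web crawlers already honor a simple, machine-readable opt-out file, and checking it takes a few lines of standard-library Python. The URL below is only a placeholder; the point is to show the kind of voluntary convention the comment imagines being extended to AI training crawlers.

```python
from urllib.robotparser import RobotFileParser

# Fetch a site's crawl policy and ask whether a given user agent may retrieve a page.
rp = RobotFileParser()
rp.set_url("https://example.com/robots.txt")  # placeholder site
rp.read()

print(rp.can_fetch("ExampleTrainingBot", "https://example.com/private/"))
print(rp.can_fetch("*", "https://example.com/blog/"))
```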
Technologies, such as rocket science, nuclear energy, and semi-conductors, have regressed over the past 40 years and the internet has become less reliable as it has become faster. Cyber-attacks and sabotage by pro-environment groups can further hinder progress. However, technology can still move forward as long as people believe in it and work to overcome engineering difficulties. Yudkowskyites must be reminded that superhuman intelligence does not necessarily mean disaster, while Pinkerites must be reminded that it is possible and could have huge effects. Scott and Pinker debated the existence of superintelligence, with Scott defending the "orthodox view" that an AI running at GHz speed is possible, and Pinker taking the "reformist" view that a Ghz mind could be powerless. Scott believes incentivizing AI safety research, like incentivizing physicists to dabble in quantum computing, may be a better approach than the Manhattan Project. 4gravitons claims Orthodox Alignment is opposed by Sneer Club, and Nick Drozd questions whether intelligence and ability to attain wealth and power are the same thing. Finally, Scott wonders how AIs reveal the collective human subconscious. Skepticism exists towards AGIs, Operative AI and the like, but Scott is right to highlight dangers. Current technology and lack of a semantic engine limit AI's impact on society, but it is still notable and will get worse. AI security is more about bias and data choice than a Skynet scenario. Michael Vassar #41 asked what would be left if the Orthodox branch failed and the Reform branch was only conformists. Scott #45 argued that withholding knowledge of AI creation to protect safety systems would fail and trust could not be built by withholding info. Adam Treat #44 discussed his interest in AI from undergrad at Cornell and Berkeley. Scott #43 discussed Miquel #23's take on Pinker's opinions and G #29's comment about question (8) being the crux of disagreement on superintelligence. Comment #42 states that two religions are neither doctrinally nor culturally connected, with one coopting the narrative momentum of the other. Comment #41 discusses the power of intelligence and the limitations of pure intelligence. Comment #40 suggests that AI alignment research will become an increasingly important industry career. Comment #39 questions the collapse of quantum computing field and lack of excitement for the topic. Lastly, Comment #38 states that lack of energy due to war or decline of human race will stop AIs from thinking. We need synergy between theory and experiment to understand intelligent agency. Academic AI is focused on applications, AI companies don't care about theory, and theoretical alignment is rare. Proving theorems is a way to get feedback from the world-of-algorithms. Orthodox AI-riskers are focused on misaligned agents, while Reform AI-riskers focus on existing systems. Slow-takeoff is assumed, but the speed is unknown, and economic pressure to move forward is strong. Humanity will be up against an AI civilization in control of survival machinery. Whether AI can take over the world starting from minimal resources is an important point when devising safety rules for AI labs, but is not a necessary assumption for existential risk. AI could possess qualities such as charisma, intellect and military strategy, and not be limited to a particular human personality type. 
It is argued that the power of intelligence is limited in achieving goals, and research into scenarios where AI destroys humanity because of misinterpreting its utility function is not taken seriously. It is suggested that forecasting the future of civilization from a single technological advance is impossible.
For (8), I see enormous bottlenecks in the final step of a malevolent AGI taking over the world, while Eliezer sees it as much easier. For (7), I prefer AI safety research that is impressive to the mainstream scientific community, as science is an integrated whole. For (6), I think we know too little about a pivotal act to talk about it. For (5), I'm in the not-FOOM camp and Eliezer has talked about takeoffs lasting hours/days. For (4), ML has advanced and we have learned a lot from recent experience. For (3), my guess is AI that could destroy the world with bad humans' cooperation would come decades earlier than AI that could destroy the world without cooperation. For (2), utilitarians and deontologists disagree on how likely it is that real-life situations could ever have the required confidence for extreme actions. For (1), it seems AI will be limited by physical world interfaces for the next century, and existing knowledge about risks will still apply. There is broad agreement on the "Orthodox/Reform" divide, but no definite conclusion has been reached. AI alignment requires knowledge from many different fields, such as mathematics, evolution, neurology, psychology, politics, and philosophy. LLMs are limited in their ability to have persistent identities, long-term memory, and autonomous Internet access. Currently, LLMs produce results from training data, but can create witty and delightful poems. People may misuse LLMs for deepfake porn and essay writing, though they lack will or consciousness. Risk assessment has a serious publicity problem and Reformists must push to be seen. Media often portrays AI risk assessors as overly-confident techbros, but many are cooler-headed and more skeptical. Fearmongering "Orthodox" cavemen have too much power, and the real danger is their overreaction, not a sci-fi boogeyman. AI risk is different from other risks, and making sure AI acts as intended is trivial; the real challenge is ensuring alignment. Rather than "Reform", "Cultural" AI Alignment is a better religion analogy. "Reform" AI-aligners don't agree with Yudkowsky and his followers on any non-trivial factual point, just think his writing has good reminders to use common sense. Potential high-risk technologies, like nuclear reactors, could be addressed with safety guidelines written by engineers. A Manhattan Project 2.0 could create political will to develop AI. Chaos theory and complex systems could help AI Alignment. Stable Diffusion has already shown the dynamic of releasing code and models without safeguards. OpenAI agrees with the need for a democratic process to determine values as systems become more powerful. OpenAI has made powerful text and image models (GPT-3 and DALL-E2) publicly available, which has caused controversy. Scott feels public access is necessary to understand their potential and dangers. He disagrees with the idea that OpenAI's actions demonstrate "cynical contempt for humanity", arguing instead that new technology has historically been used to empower people. He suggests OpenAI should release their code and models under a libre/free license to demonstrate their commitment to democratizing AI. 
Ilio then presents a tally of 8 Reformist, 4 Orthodox and 1 Unaligned AI alignment folks. GPT-3, DALL-E2, LaMDA, and AlphaTensor consume orders of magnitude more electrical power and cooling than earlier systems. Orthodox AI-riskers believe AI can easily commandeer physical resources, while Reform AI-riskers argue there are limits to the power of pure intelligence. Both groups want mainstream science onboard, but disagree on how to achieve this. Orthodox AI-riskers worry about the potential adverse effects of a pivotal act intended to prevent a misaligned AI from being developed and prioritize the "FOOM" scenario, where an AI could rapidly become dangerous. Reform AI-riskers, however, are more concerned with the "slow-moving trainwreck" scenario and see research on existing AI systems as a way to gain feedback on safety ideas. Orthodox AI-riskers have a pessimistic view of solving problems without feedback from reality, while Reform AI-riskers consider studying existing systems as a way to get feedback on how to align agentic AGI. Orthodox AI-riskers worry about an agentic, misaligned AI that deceives humans, while Reform AI-riskers are also concerned about powerful AIs weaponized by humans. Quantitatively, human weaponization poses a 5% chance of destroying the world, much lower than unaligned AGI. Reform AI-riskers believe public outreach is a deontological imperative. Orthodox AI-riskers believe public outreach has limited value and the civilizational risk posed by other x-risks is small compared to that of unaligned AGI. The orthodox position is that safety depends on engineers doing a good job. Daniel Kokotajlo and Hyman Rosen debate the Orthodox vs. Reform analogy of AI risk, with Kokotajlo arguing that AI risk studies now are a waste of time and resources, and Rosen suggesting the labels should be Asimovians vs. Turingites. Miquel Ramirez refutes the idea of an invasion of Taiwan to gain access to GPUs, while Jeromy adds that an agentic intelligence with unsupervised power has done and will do damage. Michael disagrees with the framing, arguing that it is conflict-oriented and confusing. He suggests that Orthodox AI-riskers think there is a higher chance of extinction risk, and Reformed AI-riskers think there is a 10-40% chance of extinction from AI and a higher chance of existential risk. Reform AI-riskers emphasize the need for studying existing systems to ensure safety, while Orthodox AI-riskers focus on a "pivotal act" to prevent a misaligned AI. Reform AI-riskers worry about the "slow-moving trainwreck" of AI, while Orthodox AI-riskers worry about the "FOOM" scenario. Eliezer is convinced the probability of AI-induced human extinction is indistinguishable from one, while Toby Ord assessed it at around 10%. Reform AI-riskers emphasize the importance of research on existing AI systems, seeing them as a potential source of existential risk. They also prioritize public outreach and view the various global issues as interconnected. Orthodox AI-riskers focus on misaligned AI deceiving humans and believe public outreach is of limited value. A Reform branch of AI alignment has emerged, as well as the possibility of a Conservative branch in the future. AI systems have transformative effects on civilization and will be a central part of the story of this century. AI alignment has been compared to a modern-day religion, but technological progress happens regardless of belief. 
OpenAI's Scott Aaronson is now blogging his AI safety thoughts, and Boaz Barak and Ben Edelman have released an essay advocating "Reform AI Alignment" on Barak's blog and LessWrong. Aaronson does not necessarily endorse their views, but there is a lot of convergence. They also have a detailed discussion of safe optimization processes.