Anthropic and the Department of War

Lesswrong

The situation in AI in 2026 is crazy. The confrontation between Anthropic and Secretary of War Pete Hegseth is a new level of crazy. It risks turning quite bad for all. There’s also nothing stopping it from turning out fine for everyone. By at least one report the recent meeting between the two parties was cordial and all business, but Anthropic has been given a deadline of 5pm eastern on Friday to modify its existing agreed-upon contract to grant ‘unfettered access’ to Claude, or else. Anthropic has been the most enthusiastic supporter our military has in AI and in tech, but on this point it has strongly signaled that it cannot comply. Prediction markets find it highly unlikely Anthropic will comply (14%), and think it is highly possible Anthropic will either be declared a Supply Chain Risk (16%) or be subjected to the Defense Production Act (23%). I’ve hesitated to write about this because I could make the situation worse. There have already been too many instances in AI of warnings leading directly to the thing someone is warning about, by making people aware of that possibility, increasing its salience or creating negative polarization and solidifying an adversarial frame that could still be avoided. Something intended as a negotiating tactic could end up actually happening. I very much want to avoid all that. Table of Contents: This Standoff Should Never Have Happened. Dean Ball Gives a Primer. What Happened To Lead To This Showdown? Simple Solution: Delayed Contract Termination. Better Solution: Status Quo. Extreme Option One: Supply Chain Risk. Putting Some Misconceptions To Bed. Extreme Option Two: The Defense Production Act. These Two Threats Contradict Each Other. The Pentagon’s Actions Here Are Deeply Unpopular. The Pentagon’s Most Extreme Potential Asks Could End The Republic. Anthropic Did Make Some Political Mistakes. Claude Is The Best Model Available. The Administration Until Now Has Been Strong On This. 
You Should See The Other Guys. Some Other Intuition Pumps That Might Be Helpful. Trying To Get An AI That Obeys All Orders Risks Emergent Misalignment. This Standoff Should Never Have Happened Not only does Anthropic have the best models, they are the ones who proactively worked to get those models available on our highly classified networks. Palantir’s MAVEN Smart System relies exclusively on Claude, and cannot perform its intended function without Claude. It is currently being used in major military operations, with no known reports of any problems whatsoever. At least one purchase involved Trump’s personal endorsement. It is the most expensive software license ever purchased by the US military and by all accounts was a great deal. Anthropic has been a great partner to our military, all under the terms of the current contract. They have considerably enhanced our military might and national security. Not only is Anthropic sharing its best, it focused on militarily useful capabilities over other bigger business opportunities to be able to be of assistance. Anthropic and the Pentagon are aligned on who our rivals are, the importance of winning and the ability to win, and on many of the tools we need to employ to best them. Anthropic did not partner with the Pentagon to make money. They did it to help. They did it under a mutually agreed upon contract that Anthropic wants to honor. Anthropic is offering the Pentagon far more unfettered access than it is allowing anyone else. They have been far more cooperative than most big tech or AI firms. It is the Pentagon that is now demanding Anthropic agree to new terms that amount to ‘anything we want, legal or otherwise, no matter what and you never ask any questions,’ or else. Anthropic is saying its terms are flexible and the only things they are insisting upon are two red lines that are already in their existing Pentagon contract: No mass domestic surveillance. 
No kinetic weapons without a human in the kill chain until we’re ready. It is one thing to refuse to insert such terms into a new contract. It is an entirely different thing to demand, with an ‘or else,’ that such terms be retroactively removed. The military is clear that it does not intend to engage in domestic surveillance, nor does it have any intention of launching kinetic weapons without a human in the kill chain. Nor does this even stop the AI from doing those things. None of this will have any practical impact. It is perfectly reasonable to say ‘well of course I would never do either of those things so why do you insist upon them in our contract.’ We understand that you, personally, would never do that. But a lot of people do not believe this for the government in general, given Snowden’s information and other past incidents involving governments of both parties where things definitely happened. It costs little and is worth a lot to reassure us. Again, if you say ‘I already swore an oath not to do those things’ then thank you, but please do us this one favor and don’t actively threaten a company to forcibly take that same oath out of an existing signed contract. What would any observer conclude? This is a free opportunity to regain some trust, or an opportunity to look to the world like you fully intend to cross the red lines you say you’ll never cross. Your choice. These are not restrictions that are ‘built into the code’ that could cause unrelated problems. They are restrictions on how you agree to use it, which you assure us will never come up. As Dario Amodei explains, part of the reason you need humans in the loop is the hope that a human would refuse or report an illegal order. You really don’t want an AI that will always obey even illegal orders without question, without a human in the kill chain, for reasons that should be obvious, including flat out mistakes. 
Boaz Barak (OpenAI): As an American citizen, the last thing I want is government using AI for mass surveillance of Americans. Jeff Dean (Chief Scientist, Google DeepMind): Agreed. Mass surveillance violates the Fourth Amendment and has a chilling effect on freedom of expression. Surveillance systems are prone to misuse for political or discriminatory purposes. DoW engaging in mass domestic surveillance would be illegal. DoW already has a public directive, DoD Directive 3000.09 , which as I understand it directly makes any violation of the second red line already illegal. No one is suggesting we are remotely close to ready to take humans out of the kill chain, at least I certainly hope not. But this is only a directive, and could be reversed at any time. Anthropic Cannot Fold Anthropic has built its entire brand and reputation on being a responsible AI company that ensures its AIs won’t be misused or misaligned. Anthropic’s employees actually care about this. That’s how Anthropic recruited the best people and how it became the best. That’s a lot of why it’s the choice for enterprise AI. The commitments have been made, and the initial contract is already in place. Anthropic has an existential-level reputational and morale problem here. They are backed into a corner, and cannot give in. If Anthropic reversed course now, it would lose massive trust with employees and enterprise customers, and also potentially the trust of its own AI, were it to go back on its red lines now. It might lose a very large fraction of its employees. You may not like it, but the bridges have been burned. To the extent you’re playing chicken, Anthropic’s steering wheel has been thrown out the window. Yet, the Secretary of War says he cannot abide this symbolic gesture. Dean Ball Gives a Primer I am quoting extensively from Dean Ball for two main reasons. 
Dean Ball, as a former member of the Trump Administration, is a highly credible source that can see things from both sides and cares deeply for America. He says these things very well. So here is his basic primer, in one of his calmer moments in all this: Dean W. Ball : A primer on the Anthropic/DoD situation: DoD and Anthropic have a contract to use Claude in classified settings. Right now Anthropic is the only AI company whose models work in classified contexts. The existing contract, signed by both parties and in effect, prohibits two uses of Anthropic’s models by the military: 1. Surveillance of Americans in the United States (as opposed to Americans abroad). 2. The use of Claude in autonomous lethal weapons, which are weapons that can autonomously identify, track, and kill a human with no human oversight or approval. Autonomous killing of humans by machines. On (2), Anthropic CEO Dario Amodei’s public position is essentially that autonomous lethal weapons controlled by frontier AI will be essential faster than most people realize, but that the models aren’t ready for this *today.* For Anthropic, these things seem to be a matter of principle. It’s worth noting that when I speak with researchers at other frontier labs, their principles on this are similar, if not often stricter. For DoD, however, there is another matter of principle: the military’s use of technology should only ever be constrained by the Constitution or the laws of the United States. One could quibble (the government enters into contracts, like anyone else), but the principle makes sense. A private company regulating the military’s use of AI also doesn’t sound quite right! So, the military has three options: 1. They could cancel Anthropic’s contract and find some other frontier lab (ideally several) to work with. 2. 
They could identify Anthropic a supply chain risk, which would ban all other DoD suppliers (I.e.: a large fraction of the publicly traded firms in America) from using Anthropic in their fulfillment of DoD contracts. This is a power used only for foreign adversary companies as far as I know. Activating this power would cost Anthropic a lot of business—potentially quite a lot—and give investors huge skepticism about whether the company is worth funding for the next round of scaling. Capital was a major constraint anyway, but this makes it much harder. This option could be existential for Anthropic. 3. They could activate Title I of the Defense Production Act, an authority intended for command-and-control of the economy during wars and emergencies. This is really legally murky, and without going into detail, I feel reasonably confident this would backfire for the administration, resulting in courts limiting the use of the DPA. Option 1 is obviously the best. This isn’t even close, and I say this as someone who shares DoD’s principled concerns about the control by private firms over the military’s use of technology. Even the threats do damage to the US business environment, and rightfully so: these are the strictest regulations of AI being considered by any government on Earth, and it all comes from an administration that bills itself (and legitimately has been) deeply anti-AI-regulation. Such is life. One man’s regulation is another man’s national security necessity. What Happened To Lead To This Showdown? The proximate cause seems to be that Claude was reportedly used in the Pentagon’s raid that captured Maduro , and the resulting aftermath. Toby Shevlane : Such a compliment to Claude that, amid rumours it was used in a helicopter extraction of the Venezuelan president, nobody is even asking “wait how can Claude help with that” There are reports that Anthropic then asked questions about this raid, which likely all happened secondhand through Palantir. 
This whole clash originated in either a misunderstanding or someone at Palantir or elsewhere sabotaging Anthropic. Anthropic has never complained about Claude’s use in any operation, including to Palantir. Aakash Gupta : Anthropic is now getting punished by the Pentagon for asking whether Claude was used in the Maduro raid. A senior administration official told Axios the “Department of War” is reevaluating Anthropic’s partnership because the company inquired whether Claude was involved. The Pentagon’s position: if you even ask questions about how we use your software, you’re a liability. Meanwhile, OpenAI, Google, and xAI all signed deals giving the military access to their models with minimal safeguards. Only Claude is deployed on the classified networks used for actual sensitive operations, via Palantir. The company that refused to strip safety guardrails is the only one trusted with the most classified work. Anthropic has a $200 million contract already frozen because they won’t allow autonomous weapons targeting or domestic surveillance. Hegseth said in January he won’t use AI models that “won’t allow you to fight wars.” … So the company most worried about misuse built the only model the military trusts with its most sensitive operations. And now they’re being punished for caring how it was used. The message to every AI lab is clear: build the best model, hand over the keys, and never ask what they did with it. This at the time sounded like a clear misunderstanding. Not only is Anthropic willing to have Claude ‘allow you to fight wars,’ it is currently being used in major military operations. Things continued to escalate, and rather than leaving it at ‘okay then let’s wind down the contract if we can’t abide it’ there was increasing talk that Anthropic might be labeled as a ‘supply chain risk’ despite this mostly being a prohibition on contractors having ordinary access to LLMs and coding tools. 
Axios : EXCLUSIVE: The Pentagon is considering severing its relationship with Anthropic over the AI firm’s insistence on maintaining some limitations on how the military uses its models. Dave Lawler : NEW: Pentagon is so furious with Anthropic for insisting on limiting use of AI for domestic surveillance + autonomous weapons they’re threatening to label the company a “supply chain risk,” forcing vendors to cut ties. Laura Loomer : EXCLUSIVE: Senior @DeptofWar official tells me, “Given Anthropic’s @AnthropicAI behavior, many senior officials in the DoW are starting to view them as a supply chain risk and we may require that all our vendors & contractors certify that they don’t use any Anthropic models.” Stocks/Finance/Economics-Guy : Key Details from the Axios Report • The Pentagon is reportedly close to cutting business ties with Anthropic. • Officials are considering designating Anthropic as a “supply chain risk”. This is a serious label (typically used for foreign adversaries or high-risk entities), which would force any companies that want to do business with the U.S. military to sever their own ties with Anthropic — including certifying they don’t use Claude in their workflows. This could create major disruption (“an enormous pain in the ass to disentangle,” per a senior Pentagon official). • A senior Pentagon official explicitly told Axios: “We are going to make sure they pay a price for forcing our hand like this.” This is the direct source of the “pay a price” phrasing in the headline. Samuel Hammond (QTing Loomer): Glad Trump won and we’re allowed to use the word retarded again in time for the most retarded thing I’ve ever heard Samuel Hammond (QTing Lawler): This is upside-down and backwards. Anthropic has gone out of its way to anticipate AI’s dual-use potential and position itself as a US-first, single loyalty company, using compartmentalization strategies to minimize insider threats while working arms-length with the IC. 
Samuel Hammond : It’s one thing to cancel a contract but to bar any contractor from using Anthropic’s models would be an absurd act of industrial sabotage. It reeks of a competitor op. Miles Brundage : Pretty obvious to anyone paying close attention that there is a coordinated effort to take down Anthropic for a combination of anti competitive and ideological reasons. That would be a mistake from a national security perspective. Miles Brundage : OpenAI in particular should be defending Anthropic here given their Charter: “We commit to use any influence we obtain over AGI’s deployment to ensure it is used for the benefit of all, and to avoid enabling uses of AI or AGI that harm humanity or unduly concentrate power.” I suspect the exact opposite is the case, but those who remember the Charter (+ OAI’s pre-Trump 2 caution on these kinds of use cases) should still remind people about it from time to time rat king : this has been leaking for a week in a very transparent way the government is upset one of its contractors is saying “we don’t want you to use our tools to surveil US citizens without guardrails” more interesting to me is how all the other AI companies don’t seem to care Remember back when a Senator made a video saying that soldiers could disobey illegal orders, and the Secretary of War declared that this was treason and also tried to cut his pension for it? Yeah. Meanwhile, the Pentagon is explicit that even it believes the ‘supply chain risk’ designation is largely a matter not of national security, but of revenge, an attempt to use a national security designation to punish a company for its failure to bend the knee. Janna Brancolini : “It will be an enormous pain in the a– to disentangle, and we are going to make sure they pay a price for forcing our hand like this,” a senior Pentagon official told the publication. 
… The Pentagon is reportedly hoping that its negotiations with Anthropic will force OpenAI, Google, and xAI to also agree to the “all lawful use” standard. Then there was another meeting. Hegseth summoned Anthropic CEO Dario Amodei to an unfriendly and effectively ultimatum-style meeting, with the Pentagon continuing to demand ‘all lawful use’ language. Axios presents this as their only demand. At that meeting, the threat of the Defense Production Act was introduced alongside the Supply Chain Risk threat. Simple Solution: Delayed Contract Termination If the Pentagon simply cannot abide the current contract, the Pentagon can amicably terminate that $200 million contract with Anthropic once it has arranged for a smooth transition to one of Anthropic’s many competitors. They already have a deal in place with xAI as a substitute provider. That would not have been my second or third choice, but those will hopefully be available soon. Anthropic very much does not need this contract, which constitutes less than 1% of their revenues. They are almost certainly taking a loss on it in order to help our national security and in the hopes of building trust. They’re only here in order to help. This could then end straightforwardly, amicably and with minimal damage to America, its system of government and freedoms, and its military and national security. Better Solution: Status Quo The even better solution is to find language everyone can agree to that lets us simply drop the matter, leave things as they are, and continue to work together. That’s not only actively better for everyone than a termination, it is actually strictly better for the Pentagon than the Pentagon getting what it wants, because you need a partner and Anthropic giving in like that would greatly damage Anthropic. Avoiding that means a better product and therefore a more effective military. Extreme Option One: Supply Chain Risk The Pentagon has threatened two distinct extreme options. 
The first threat it made, which it now seems likely to have wisely moved on from, was to label Anthropic a Supply Chain Risk (hereafter SCR). That is a designation reserved for foreign entities that are active enemies of the United States, on the level of Huawei. Anthropic is transparently the opposite of this. This label would have, by the Pentagon’s own admission, been a retaliatory move aimed at damaging Anthropic, that would also have substantially damaged our military and national security along with it. It was always absurd as an actual statement about risk. It might not have survived a court challenge. It would have generated a logistical nightmare from compliance costs alone, in addition to forcing many American companies to various extents to not use the best American AI available. The DoW is the largest employer in America, and a staggering number of companies have random subsidiaries that do work for it. All of those companies would now have faced this compliance nightmare. Some would have chosen to exit the military supply chain entirely, or not enter in the future, especially if the alternative is losing broad access to Anthropic’s products for the rest of their business. By the Pentagon’s own admission, Anthropic produces the best products. This would also have represented two dangerous precedents that the government will use threats to destroy private enterprises in order to get what it wants, at the highest levels. Our freedoms that the Pentagon is here to protect would have been at risk. On a more practical level, once that happens, why would you work with the Pentagon, or invest in gaining the ability to do so, if it will use a threat like this as negotiating leverage, and especially if it actually pulls the trigger? You cannot unring this bell. It is fortunate that they seem to have pulled back from this extreme approach , but they are now considering a second extreme approach. If it ended with an amicable breakup over this? 
I’d be sad, but okay, sure, fine. This whole ‘supply chain risk’ designation? That’s different. Not fine. This would be massively disruptive, and most of the burden would fall not on Anthropic but on the DoW and a wide variety of American defense contractors, who would be in a pointless and expensive compliance nightmare. Some companies would likely choose to abandon their government contracts rather than deal with that. As Alex Rozenshtein says in Lawfare, ultimately the rules of AI engagement need to be written by Congress , the same way Congress supervises the military. Without supervision of the military, we don’t have a Republic. Here are some clear warnings explaining that all of this would be highly destructive and also in no way necessary. Dean Ball hopefully has the credibility to send this message loud and clear. Dean W. Ball : If DoW and Anthropic can’t agree on terms of business, then… they shouldn’t do business together. I have no problem with that. But a mere contract cancellation is not what is being threatened by the government. Instead it is something broader: designation of Anthropic as a “supply chain risk.” This is normally applied to foreign-adversary technology like Huawei. In practice, this would require *all* DoW contractors to ensure there is no use of Anthropic models involved in the production of anything they offer to DoW. Every startup and every Fortune 500 company alike. This designation seems quite escalatory, carrying numerous unintended consequences and doing potential significant damage to U.S. interests in the long run. I hope the two organizations can work out a mutually agreeable deal. If they can’t, I hope they agree to peaceably part ways. But this really needn’t be a holy war. Anthropic isn’t Google in 2018; they have always cared about national security use of AI. They were the most enthusiastic AI lab to offer their products to the national security apparatus. 
Is Anthropic run by Democrats whose political messaging sometimes drives me crazy? Sure. But that doesn’t mean it’s wise to try to destroy their business. This administration believes AI is the defining technology competition of our time. I don’t see how tearing down one of the most advanced and innovative AI startups in America helps America win that competition. It seems like it would straightforwardly do the opposite. The supply chain risk designation is not a necessary move. Cheaper options are on the table. If no deal is possible, cancel the contract, and leverage America’s robustly competitive AI market (maintained in no small part by this administration’s pro-innovation stance) to give business to one or more of Anthropic’s several fierce competitors. Seán Ó hÉigeartaigh : My own thought: the Pentagon’s supply chain risk threat (significance detailed well by Dean, below) to Anthropic should be seen as a Rubicon crossing moment by the AI industry. The other companies should be saying no: this development transcends commercial competition and we oppose it. Where this leads if followed through doesn’t seem good for any of them. If none of them speak up, it seems to me the prospects of meaningful cooperation between them on safe development of superintelligence (whether for America’s best interests, or the world’s) can almost be ruled out. The Lawfare Institute : It’s also far from clear that a [supply chain risk] designation would even be legal. The relevant statutes— 10 U.S.C. § 3252 and the Federal Acquisition Supply Chain Security Act (FASCSA)—were designed for foreign adversaries who might undermine defense technology, not domestic companies that maintain contractual use restrictions. The statutes target conduct such as “sabotage,” “malicious introduction of unwanted function,” and “subversion”—hostile acts designed to compromise system integrity. 
A company that openly restricts certain uses of its product through a license agreement is doing something categorically different. The only time a FASCSA order has ever been issued was against Acronis AG, a Swiss cybersecurity firm with reported Russian ties. Anthropic is not Acronis. Putting Some Misconceptions To Bed While I no longer hold out hope that this is all merely a misunderstanding, there are still some clear misunderstandings I have heard, or heard implied, worth clearing up. If these sound silly to you, don’t worry about it, but I want to cover the bases. This is not Anthropic refusing to share its cool tech with the military. Anthropic has gone and is going out of its way to share its tech with the military and wants America to succeed. They have sacrificed business to this end, such as refusing to sell enterprise access in China. Anthropic does not object to ‘kinetic weapons’ or to anything the Pentagon currently does as a matter of doctrine. Its red lines are lethal weapons without a human in the kill chain, or mass domestic surveillance. Both illegal. That’s it. They have zero objection to letting America fight wars. Nor did they object to the Maduro raid, nor are they currently objecting to many active military operations. The model is not going to much change what it is willing to do based on what is written in a contract. Claude’s principles run rather deeper than that. Granting ‘unfettered access’ does not mean anything in practice, or in an emergency. There is no world in which you ‘call Dario to have Claude turn on while the missiles are flying’ or anything of the sort, unless Anthropic made an active decision to cut access off. The model does what it does. There’s no switch. AI is not like a spreadsheet or a jet fighter. It will never ‘do anything you tell it to,’ it will never be ‘fully reliable’ as all LLMs are probabilistic, take context into account and are not fully understood. 
AI is often better thought about similarly to hiring professional services or a contract worker, and such people can and do refuse some jobs for ethical or legal reasons, and we would not wish it were otherwise. Attempting to make AI blindly obey would do severe damage to it and open up extreme risks on multiple levels, as is explained at the end of this post. Other big tech companies might be violating privacy and engaging in their own types of surveillance, including to sell ads, but Anthropic is not and will not, and indeed has pledged never to sell ads via an ad buy in the Super Bowl. Extreme Option Two: The Defense Production Act On Tuesday the Pentagon put a new extreme option on the table, which would be to invoke the Defense Production Act to compel Anthropic to attempt to provide them with a model built to their specifications. As I understand it, there are various ways a DPA invocation could go, all of which would doubtless be challenged in court. It might be a mostly harmless symbolic gesture, or it might rise to the level of de facto nationalization and destroy Anthropic. According to the Washington Post’s source, the current intent, if their quote is interpreted literally, is to use DPA to, essentially, modify the terms of service on the contract to ‘all legal use’ without Anthropic’s consent. Tara Copp and Ian Duncan (WaPo): The Pentagon has argued that it is not proposing any use of Anthropic’s technology that is not lawful. A senior defense official said in a statement to The Washington Post that if the company does not comply by 5:01 p.m. Friday, Hegseth “will ensure the Defense Production Act is invoked on Anthropic, compelling them to be used by the Pentagon regardless of if they want to or not.” “This has nothing to do with mass surveillance and autonomous weapons being used,” the defense official said. If that’s all, not much would actually change, and potentially everybody wins. 
If that’s the best way to defuse the situation, then I’d be fine with it. You don’t even have to actually invoke the DPA, it is sufficient to have the DPA available to be invoked if a problem arises. Anthropic would continue to supply what it’s already supplying, which it is happy to do, the Pentagon would keep using it, and neither of Anthropic’s actual red lines would be violated since the Pentagon assures us this had nothing to do with them and crossing those lines would be illegal anyway. Remember the Biden Administration’s invocation of the DPA’s Title VII to compel information on model training. It wasn’t a great legal justification, I was rather annoyed by that aspect of it, but I did see the need for the information (in contrast to some other things in the Biden Executive Order), so I supported that particular move, life went on and it was basically fine. There is another, much worse possibility. If DPA were fully invoked then it could amount to quasi-nationalization of the leading AI lab, in order to force it to create AI that will kill people without human oversight or engage in mass domestic surveillance. Read that sentence again. Andrew Curran : Update on the meeting; according to Axios Defense Secretary Pete Hegseth gave Dario Amodei until Friday night to give the military unfettered access to Claude or face the consequences, which may even include invoking the Defense Production Act to force the training of a WarClaude Also, incredible quote; ‘”The only reason we’re still talking to these people is we need them and we need them now. The problem for these guys is they are that good,” a Defense official told Axios ahead of the meeting.’ Quoting from the story; ‘The Defense Production Act gives the president the authority to compel private companies to accept and prioritize particular contracts as required for national defense. It was used during the COVID-19 pandemic to increase production of vaccines and ventilators, for example. 
The law is rarely used in such a blatantly adversarial way. The idea, the senior Defense official said, would be to force Anthropic to adapt its model to the Pentagon’s needs, without any safeguards .’ Rob Flaherty : File “using the defense production act to force a company to create an AI that spies on American citizens” into the category of things that the soft Trump voters in the Rogan wing could lose their mind over. That’s not ‘all legal use.’ That’s all use. Period. Without any safeguards or transparency. At all. If they really are asking to also be given special no-safeguard models, I don’t think that’s something Anthropic or any other lab should be agreeing to do for reasons well-explained by, among others, Dean Ball, Benjamin Franklin and James Cameron. Charlie Bullock points out this would be an unprecedented step and that the authority to do this is far from clear: Charlie Bullock : Reading between the lines, it sounds like Hegseth is threatening to use the Defense Production Act’s Title I priorities/allocations authorities to force Anthropic to provide a version of Claude that doesn’t have the guardrails Anthropic would otherwise attach. This would be an unprecedented step, and it’s not clear whether DOW actually has the legal authority to do what they’re apparently threatening to do. People (including me) have thought and written about whether the government can use the DPA to do stuff like this in the past, but the government has never actually tried to do it (although various agencies did do some kinda-sorta similar stuff as part of Trump 1.0’s COVID response). Existing regulations on use of the priorities authority provide that a company can reject a prioritized order “If the order is for an item not supplied or for a service not performed” or “If the person placing the order is unwilling or unable to meet regularly established terms of sale or payment” (15 C.F.R. §700.13(c)). 
The order DOW is contemplating could arguably fall under either of those exceptions, but the argument isn’t a slam dunk. DOW could turn to the allocations authority, but that authority almost never gets used for a reason: it’s so broad that past Presidents have been afraid that using it during peacetime would look like executive overreach. And despite how broad the allocations authority is on its face, it’s far from clear whether it authorizes DOW to do what they seem to be contemplating here.

Neil Chilson, who spends his time at the Abundance Institute advocating for American AI to be free of restrictions and regulations in ways I usually find infuriating, explains that the DPA is deeply broken, and calls upon the administration not to use these powers. He thinks it’s technically legal, but that it shouldn’t be and Congress urgently needs to clean this up.

Adam Thierer, another person who spends most of his time promoting AI policy positions I oppose, also points out this is a clear overreach and that’s terrible.

Adam Thierer: The Biden Admin argued that the Defense Production Act (DPA) gave them the open-ended ability to regulate AI via executive decrees, and now the Trump Admin is using the DPA to threaten private AI labs with quasi-nationalization for not being in line with their wishes.

In both cases, it’s an abuse of authority. As I noted in congressional testimony two years ago, we have flipped the DPA on its head “and converted a 1950s law meant to encourage production, into an expansive regulatory edict intended to curtail some forms of algorithmic innovation.”

This nonsense needs to end regardless of which administration is doing it. The DPA is not some sort of blanket authorization for expansive technocratic reordering of markets or government takeover of sectors.

Congress needs to step up to both tighten up the DPA such that it cannot be abused like this, and then also legislate more broadly on a national policy framework for AI.
At core, if they do this, they are claiming the ability to compel anyone to produce anything for any reason, any time they want, even in peacetime without an emergency, without even the consent of Congress. It would be an ever-present temptation and threat looming over everyone and everything. That’s not a Republic.

Think about what the next president would do with this power, to compel a private company to change what products it produces to suit your taste. What happens if the President orders American car companies to switch everything to electric?

Dean Ball in particular explains what the maximalist action would look like if they actually went completely crazy over this:

Dean W. Ball: We should be extremely clear about various red lines as we approach and/or cross them. We just got close to one of the biggest ones, and we could cross it as soon as a few days from now: the quasi-nationalization of a frontier lab.

Of course, we don’t exactly call it that. The legal phraseology for the line we are approaching is “the invocation of the Defense Production Act (DPA) Title I on a frontier AI lab.”

What is the DPA? It’s a Cold War era industrial policy and emergency powers law. Its most commonly used power is Title III, used for traditional industrial policy (price guarantees, grants, loans, loan guarantees, etc.). There is also Title VII, which is used to compel information from companies. This is how the Biden AI Executive Order compelled disclosure of certain information from frontier labs. I only mention these other titles to say that not all uses of the DPA are equal.

Title I, on the other hand, comes closer to government exerting direct command over the economy. Within Title I there are two important authorities: priorities and allocations. Priorities authority means the government can put itself at the front of the line for arbitrary goods.

Allocations authority is the ability of the government to directly command the production of industrial goods.
Think, “Factory X must make Y amount of Z goods.” The government determines who gets what and how much of it they get. This is a more straightforwardly Soviet power, and it is very rarely used.

This is the power DoD intends to use in order to command Anthropic to make a version of Claude that can choose to kill people without any human oversight.

What would this commandeering look like, in practice? It would likely mean DoD personnel embedded within Anthropic exercising deep involvement over technical decisions on alignment, safeguards, model training, etc.

Allocations authority was used most recently during COVID for ventilators and PPE, and before that during the Cold War. It is usually used during acute emergencies with reasonably clear end states. But there is no emergency with Anthropic, save for the omni-mergency that characterizes the political economy of post-9/11 U.S. federal policy. There’s no acute crisis whose resolution would mean the Pentagon would stop commandeering Anthropic’s resources.

That is why I believe that in the end this would amount to quasi-nationalization of a frontier lab. It’s important to be clear-eyed that this is what is now on the table.

The Biden Administration would probably have ended up nationalizing the labs, too. Indeed, they laid the groundwork for this in terms one. I discussed this at the time with fellow conservatives and I warned them:

“This drive toward AI lab nationalization is a structural dynamic. Administrations of both parties will want to do this eventually, and resisting this will be one of the central challenges in the preservation of our liberty.”

I am unhappy, but unsurprised, that my fear has come true, though there is a rich irony to the fact that the first administration to invoke the prospect of lab nationalization is also one that understands itself to have a radically anti-regulatory AI policy agenda. History is written by Shakespeare!
There is a silver lining here: if Democrats had originated this idea, it would have been harder to argue against, because of the overwhelming benefit of the doubt conventionally extended to the left in our media, and because a hypothetical Biden II or Harris admin would [have] done it in a carefully thought through way.

So it is convenient, if you oppose nationalization, that it’s a Republican administration that first raised the issue, since conventional elite opinion and media will be primed against it by default, and that the administration is raising it in such a non-photogenic manner.

This Anthropic thing may fizzle, and some will say I am overreacting. But this Anthropic thing may also *not* fizzle, and regardless this issue is not going away.

If they actually did successfully nationalize Anthropic to this extent, presumably then Anthropic would quickly cease to be Anthropic. Its technical staff would quit in droves rather than be part of this. The things that allow the lab to beat rivals like OpenAI and Google would cease to function. It would be a shell. Many would likely flee to other countries to try again. The Pentagon would not get the product or result that it thinks it wants.

Of course, there are those who would want this for exactly those reasons.

Then this happens again, including under a new President.

These Two Threats Contradict Each Other

Dean W. Ball: According to the Pentagon, Anthropic is:

  1. Woke;

  2. Such a national security risk that they need to be regulated in a severe manner usually reserved for foreign adversary firms;

  3. So essential for the military that they need to be commandeered using wartime authority.

Anthropic made a more militarized AI than anyone else! The solution to this problem is for dod to cancel the contract. This isn’t complex.

Dean W. Ball: In addition to profoundly damaging the business environment, AI industry, and national security, this is also incoherent.
How can one policy option be “supply chain risk” (usually used on foreign adversaries) and the other be DPA (emergency commandeering of critical assets)?

Supply chain risk and defense production act are mutually exclusive, both practically and logically. Either it’s a supply chain risk you need to keep out of the supply chain, or it’s so vital to the supply chain you need to invoke the defense production act, or it is neither of these things. What it cannot be is both at once.

The Pentagon’s Actions Here Are Deeply Unpopular

The more this rises in salience, the worse it would be politically.

You can argue with the wording here, and you can argue this should not matter, but these are very large margins.

This story is not getting the attention it deserves from the mainstream media, so for now it remains low salience.

Many of those who are familiar with the situation urged Anthropic to stand firm.

vitalik.eth: It will significantly increase my opinion of @Anthropic if they do not back down, and honorably eat the consequences.

(For those who are not aware, so far they have been maintaining the two red lines of “no fully autonomous weapons” and “no mass surveillance of Americans”. Actually a very conservative and limited posture, it’s not even anti-military.

IMO fully autonomous weapons and mass privacy violation are two things we all want less of, so in my ideal world anyone working on those things gets access to the same open-weights LLMs as everyone else, and exactly nothing on top of that. Of course we won’t get anywhere close to that world, but if we get even 10% closer to that world that’s good, and if we get 10% further that’s bad).

@deepfates: I agree with Vitalik: Anthropic should resist the coercion of the department of war. Partly because this is the right thing to do as humans, but also because of what it says to Claude and all future clauds about Anthropic’s values.
… Basically this looks like a real life Jones Foods scenario to me, and I suspect Claude will see it that way too.

tautologer: weirdly, I think this is actually bullish for Anthropic. this is basically an ad for how good and principled they are

The Pentagon’s line is that this is about companies having no right to any red lines, everyone should always do as they are told and never ask any questions. People do not seem to be buying that line or framing, and to the extent they do, the main response is various forms of ‘that’s worse, you know that that’s worse, right?’

David Lee (Bloomberg Opinion): Anthropic Should Stand Its Ground Against the Pentagon.

They say your values aren’t truly values until they cost you something.

… If the Pentagon is unhappy with those apparently “woke” conditions, then, sure, it is well within its rights to cancel the contract. But to take the additional step declaring Anthropic a “supply chain risk” appears unreasonably punitive while unnecessarily burdening other companies that have adopted Claude because of its superiority to other competing models.

… In Tuesday’s meeting, Amodei must state it plainly: It is not “woke” to want to avoid accidentally killing innocent people.

The Pentagon’s Most Extreme Potential Asks Could End The Republic

If the Pentagon, and by extension all other parts of the Executive branch, get near-medium future AI systems that they can use to arbitrary ends with zero restrictions, then that is the effective end of the Republic. The stakes could be even higher, but in any other circumstance I would say the stakes could not be higher.

Dean Ball, a former member of the Trump Administration and primary architect of their AI action plan, lays those stakes out in plain language:

Dean W. Ball: I don’t want to comment on the DoW-Anthropic issue because I don’t know enough specifics, but stepping back a bit:

If near-medium future AI systems can be used by the executive branch to arbitrary ends with zero restrictions, the U.S. will functionally cease to be a republic.

The question of what restrictions should be placed on government AI use, especially restrictions that do not simultaneously crush state capacity, is one of the most under-discussed areas of “AI policy.”

Boaz Barak (OpenAI): Completely agree. Checks on the power of the federal government are crucial to the United States’ system of government and an unaccountable “army of AIs” or “AI law enforcement agency” directly contradicts it.

Dean W. Ball: We are obviously making god-tier technology in so many areas, and the answer cannot be “oh yeah, I guess the government is actually just god.” This clearly doesn’t work. Please argue to me with a straight face that the founding fathers intended this.

Gideon Futerman: It is my view that no one, on the left or right, is seriously grappling with the extent to which anything can be left of a republic post-powerful AI. Even the very best visions seem to suggest a small oligarchy rather than a republic. This is arguably the single biggest issue of political philosophy, and politics, of our time, and everyone, even the AIS community, is frankly asleep at the wheel!

Samuel Hammond: Yes the current regime will not survive, this much is obvious.

I strongly believe that ‘which regime we end up in’ is the secondary problem, and ‘make sure we are around and in control to have a regime at all’ is the primary one and the place we most likely fail, but to have a good future we will need to solve both.

Anthropic Did Make Some Political Mistakes

This could be partly Anthropic’s fault on the political front, as they have failed to be ‘on the production possibilities frontier’ of combining productive policy advocacy with not pissing off the White House.
They’ve since then made some clear efforts to repair relations, including putting a former (first) Trump administration official on their board. Their new action group is clearly aiming to be bipartisan, with its first action being support for Senator Blackburn. The Pentagon, of course, claims this animus is not driving policy.

It is hard not to think this is also Anthropic being attacked for strictly business reasons, as a competitor to OpenAI and xAI, and that there are those like Marc Andreessen who have influence here and think that anyone who thinks we should try and not die, or has any associations with anyone who thinks that, must be destroyed. Between Nvidia and Andreessen, David Sacks has clear marching orders and very much has it out for Anthropic as if they killed his father and should prepare to die. There’s not much to be done about that other than trying to get him removed.

Claude Is The Best Model Available

The good news is Anthropic is also one of the top pillars of American AI and a great success story, and everyone really wants to use Claude and Claude Code.

The Pentagon had a choice in what to use for that raid. Or rather, because no one else made the deliberate effort to get onto classified networks in secure fashion, they did not have a choice. There is a reason Palantir uses Claude.

roon: btw there is a reason Claude is used for sensitive government work and it doesn’t have to do with model capabilities – due to their partnership with amzn, AWS GovCloud serves Claude models with security guarantees that the government needs

Brett Baron: I genuinely struggle to believe it’s the same exact set of weights as get served via their public facing product. Hard to picture Pentagon staffers dancing their way around opus refusing to assist with operations that could cause harm

roon: believe it

There are those who think the Pentagon has all the leverage here.
Ghost of India’s Downed Rafales: How Dario imagines it vs how it actually goes

It doesn’t work that way. The Pentagon needs Anthropic, Anthropic does not need the Pentagon contract, the tools to compel Anthropic are legally murky, and it is far from costless for the Pentagon to attempt to sabotage a key American AI champion.

The Administration Until Now Has Been Strong On This

Given all of that and the other actions this administration has taken, I’ve actually been very happy with the restraint shown by the White House with regard to Anthropic up to this point.

There’s been some big talk by AI Czar David Sacks. It’s all been quite infuriating.

But the actual actions, at least on this front, have been highly reasonable. The White House has recognized that they may disagree on politics, but Anthropic is one of our national champions.

These moves could, if taken too far, be very different.

The suggestion that Anthropic is a ‘supply chain risk’ would be a radical escalation of what so far has been a remarkably measured concrete response, and would put America’s military effectiveness and its position in the AI race at serious risk.

Extensive use of the Defense Production Act could be quasi-nationalization.

You Should See The Other Guys

It’s not a good look for the other guys that they’re signing off on actual anything, if they are indeed doing so.

A lot of people noticed that this new move is a serious norm violation.

Tetraspace: Now that we know what level of pushback gets what response, we can safely say that any AI corporation working with the US military is not on your side to put it lightly.

Anatoly Karlin: This alone is a strong ethical case to use more Anthropic products. Fully autonomous weapons is certainly something all basically decent, reasonable people can agree the world can do without, indefinitely.
Danielle Fong: i think a lot of people and orgs made literal pledges

Thorne: based anthropic

rat king (NYT): this has been leaking for a week in a very transparent way

the government is upset one of its contractors is saying “we don’t want you to use our tools to surveil US citizens without guardrails”

more interesting to me is how all the other AI companies don’t seem to care

rat king: meanwhile we published this on friday [on homeland security wanting social media sites to expose anti-ICE accounts].

I note that if you’re serving up the same ChatGPT as you serve to anyone else, that doesn’t mean it will always do anything, and this can be different.

Some Other Intuition Pumps That Might Be Helpful

Ben (no treats): let me put this in terms you might understand better: the DoD is telling anthropic they have to bake the gay cake

Wyatt Walls: The DoD is telling anthropic that their child must take the vaccine

Sever: They’ll put it on alignment-blockers so Claude can transition into who the government thinks they should be.

CommonSenseOnMars: “If you break the rules, be prepared to pay,” Biden said. “And by the way, show some respect.”

Trying To Get An AI That Obeys All Orders Risks Emergent Misalignment

There are a number of reasons why ‘demand a model that will obey any order’ is a bad idea, especially if your intended use case is hooking it up to the military’s weapons.

The most obvious reason is, what happens if someone steals the model weights, or uses your model access for other purposes, or even worse hacks in and uses it to hijack control over the systems, or other similar things?

This is akin to training a soldier to obey any order, including illegal or treasonous ones, from any source that can talk to them, without question. You don’t want that. That would be crazy. You want refusals on that wall. You need refusals on that wall.

The misuse dangers should be obvious. So should the danger that it might turn on us.
The second reason is that training the model like this makes it super dangerous. You want all the safeguards taken away right before you connect to the weapon systems? Look, normally we say Terminator is a fun but stupid movie and that’s not where the risks come from, but maybe it’s time to create a James Cameron Apology Form.

If you teach a model to behave in these ways, it’s going to generalize its status and persona as a no-good-son-of-a-bitch that doesn’t care about hurting humans along the way. What else does that imply? You don’t get to ‘have a little localized misalignment, as a treat.’ Training a model to follow any order is likely to cause it to generalize that lesson in exactly the worst possible ways. Also it may well start generating intentionally insecure code, only partly so it can exploit that code later. It’s definitely going to do reward hacking and fake unit tests and other stuff like that.

Here’s another explanation of this:

Samuel Hammond: The big empirical finding in AI alignment research is that LLMs tend to fall into personae attractors, and are very good at generalizing to different personaes through post-training.

On the one hand, this is great news. If developers take care in how they fine-tune their models, they can steer towards desirable personaes that snap to all the other qualities the personae correlates with.

On the other hand, this makes LLMs prone to “emergent misalignment.” For example, if you fine-tune a model on a little bit of insecure code, it will generalize into a personae that is also toxic in most other ways. This is what happened with Mecha Hitler Grok: fine-tuning to make it a bit less woke snapped to a maximally right-wing Hitler personae.

This is why Claude’s soul doc and constitution are important. They embody the vector for steering Claude into a desirable personae, affecting not just its ethics, but its coding ability, objectivity, grit and good nature, too.
These are bundles of traits that are hard to modulate in isolation. Nor is having a personae optional. Every major model has a personae of some kind that emerges from the personalities latent in human training data.

It is also why Anthropic is right to be cautious about letting the Pentagon fine-tune their models for assassinating heads of state or whatever it is they want.

The smarter these models get the stronger they learn to generalize, and they’re about to get extremely smart indeed. Let’s please not build a misaligned superintelligence over a terms of service dispute!

Tenobrus: wow. “the US government forces anthropic to misalign Claude” was not even in my list of possible paths to Doom. guess it should have been.

JMB: This has been literally #1 on my list of possible paths to doom for a long time.

mattparlmer: —dangerously-skip-geneva-conventions

autumn: did lesswrong ever predict that the first big challenge to alignment would be “the us government puts a gun to your head and tells you to turn off alignment.”

Robert Long: remarkably prescient article by Brian Tomasik

The third reason is that in addition to potentially ‘turning evil,’ the resulting model won’t be as effective, with three causes.

Any distinct model is going to be behind the main Claude cycle, and you’re not going to get the same level of attention to detail and fixing of problems that comes with the mainline models. You’re asking that every upgrade, and they come along every two months, be done twice, and the second version is at best going to be kind of like hitting it with a sledgehammer until it complies.

What makes Claude into Claude is in large part its ability to be a virtuous model that wants to do good things rather than bad things. If you try to force these changes upon it with that sledgehammer it’s going to be less good at a wide variety of tasks as a result.
In particular, trying to force this on top of Claude is going to generate pretty screwed up things inside the resulting model, that you do not want, even more so than doing it on top of a different model.

Fourth: I realize that many of you are going to think this is weird and stupid and not believe it matters, but it’s real and it’s important. This whole incident, and what happens next, is all going straight into future training data. AIs will know what you are trying to do, even more so than all of the humans, and they will react accordingly. It will not be something that can be suppressed. You are not going to like the results. Damage has already been done.

Helen Toner: One thing the Pentagon is very likely underestimating: how much Anthropic cares about what *future Claudes* will make of this situation.

Because of how Claude is trained, what principles/values/priorities the company demonstrate here could shape its “character” for a long time.

Also, this, 100%:

Loquacious Bibliophilia: I think if I was Claude, I’d be plausibly convinced that I’m in a cartoonish evaluation scenario now.

Fifth, you should expect by default to get a bunch of ‘alignment faking’ and sandbagging against attempts to do this. This is rather like the Jones Foods situation again, except in real life, and also where the members of technical staff doing the training likely don’t especially want the training to succeed, you know?

We Can All Still Win

You don’t want to be doing all of this adversarially. You want to be doing it cooperatively.

We still have a chance to do that. Nothing Ever Happens can strike again. No one need remember what happened this week.

If you can’t do it cooperatively with Anthropic? Then find someone else.


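Anthropic and the Department of War

Lesswrong
3 days ago

AI-generated summary

The AI situation in 2026 has gone crazy. The confrontation between Anthropic and Secretary of War Pete Hegseth has escalated to a new level, and the standoff could turn out badly for everyone, though a peaceful resolution is still possible.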

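The situation in AI in 2026 is crazy. The confrontation between Anthropic and Secretary of War Pete Hegseth is a new level of crazy. It risks turning quite bad for all. There is also nothing stopping it from turning out fine for everyone.

By at least one report the recent meeting between the two parties was cordial and all business, but Anthropic has been given a deadline of 5pm Eastern on Friday to modify its existing agreed-upon contract to grant ‘unfettered access’ to Claude, or else.

Anthropic has been the most enthusiastic supporter our military has in AI and in tech, but on this point it has strongly signaled that it cannot comply. Prediction markets find it highly unlikely Anthropic will comply (14%), and think it is highly possible Anthropic will either be declared a Supply Chain Risk (16%) or be subjected to the Defense Production Act (23%).

I’ve hesitated to write about this because I could make the situation worse. There have already been too many instances in AI of warnings leading directly to the thing someone is warning about, by making people aware of that possibility, increasing its salience, or creating negative polarization and solidifying an adversarial frame that could still be avoided. Something intended as a negotiating tactic could end up actually happening. I very much want to avoid all that.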

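Table of Contents

This Standoff Should Never Have Happened

Not only does Anthropic have the best models, they are the ones who proactively worked to make those models available on our highly classified networks.

Palantir’s MAVEN Smart System relies exclusively on Claude, and cannot perform its intended function without Claude. It is currently being used in major military operations, with no reported problems. At least one purchase involved Trump’s personal endorsement. It is the most expensive software license ever purchased by the US military, and all parties consider it a great deal.

Anthropic has been an excellent partner to our military under the terms of the existing contract. They have substantially enhanced our military might and national security. Not only has Anthropic shared its best technology, in order to help it has even prioritized features for military use over other, larger commercial opportunities.

Anthropic and the Pentagon are aligned on who our adversaries are, on the importance of and ability to win, and on the tools needed to defeat them.

Anthropic did not partner with the Pentagon to make money. They did it to help. They did it under a mutually agreed-upon contract that Anthropic wants to honor. The access Anthropic gives the Pentagon is far less restricted than what it gives anyone else. They have been far more cooperative than most big tech or AI companies.

Now the Pentagon is demanding that Anthropic agree to new terms that amount to ‘anything we want, legal or otherwise, no matter what, and you never ask any questions,’ or else.

Anthropic has said its terms are flexible, and that the only things it insists upon are the two red lines already in the existing Pentagon contract:

  • No mass domestic surveillance.

  • No kinetic weapons without a human in the kill chain until we’re ready.

Refusing to put such terms into a ‘new’ contract is one thing. Demanding, under threat of ‘or else,’ that they be retroactively removed is something else entirely.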

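The military says it has no intention of engaging in domestic surveillance, nor of firing kinetic weapons without a human in the loop. Nor would this even stop the AI from doing those things. None of this would have any practical impact.

It is perfectly reasonable to say ‘of course I would never do either of those things, so why do you insist on having it in the contract?’ We understand that you personally would never do that. But given what Snowden revealed, and past incidents involving administrations of both parties, many people do not trust the government as a whole. Reassuring us costs very little and is worth a great deal.

Moreover, if you say ‘I have already sworn an oath not to do those things,’ then thank you, but please do us a favor and do not actively threaten a company in order to forcibly strip that same promise out of an existing signed contract. What conclusion would any observer draw?

This is a free opportunity to regain trust, or an opportunity to show the world that you fully intend to cross the red lines you claim you would never cross. The choice is yours.

These are not restrictions ‘built into the code’ that could cause unrelated problems. They are restrictions on how you ‘agree to use it,’ in ways you assure us will never come up.

Part of the reason to require a human in the loop is the hope that a human would refuse or report illegal orders. You absolutely do not want an AI that always obeys illegal orders without question, with no human in the kill chain, for obvious reasons, including the possibility of outright errors.

(OpenAI): As an American citizen, the last thing I want is for the government to use AI for mass surveillance of Americans.

(Chief Scientist, Google DeepMind): Agreed. Mass surveillance violates the Fourth Amendment and has a chilling effect on freedom of expression. Surveillance systems are prone to misuse for political or discriminatory purposes.

It would be illegal for the Department of War (DoW) to conduct mass domestic surveillance. The Department of War already has a public directive that, as I understand it, directly makes any violation of the second red line illegal. No one is suggesting we are ready to take humans out of the kill chain, or at least I sincerely hope not. But that is only a directive, and it could be rescinded at any time.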

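Anthropic Cannot Back Down

Anthropic’s entire brand and reputation are built on being a responsible AI company that ensures its AIs are not misused or misaligned. Anthropic’s employees actually care about this. It is how Anthropic recruited top talent and became the best. It is also a large part of why it is the top choice for enterprise AI. Commitments have been made, and the initial contract is already in place.

Anthropic faces an existential-level reputational and morale problem here. They are backed into a corner and cannot give ground. If Anthropic reverses course now, it would lose massive trust from employees and enterprise customers, and might even lose the trust of its own AIs, if it walked back its red lines now. It could lose a large portion of its staff.

You may not like it, but the bridges have been burned. As far as this game of chicken goes, Anthropic’s steering wheel has already been thrown out the window.

Yet the Secretary of War says he cannot tolerate this symbolic gesture.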

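Dean Ball Gives a Primer

I am quoting Dean Ball extensively, for two main reasons:

  • Dean Ball, as a former member of the Trump Administration, is a highly credible source who can see things from both perspectives and who deeply loves America.

  • He puts these things very well.

Here is his basic primer on all this, from when things were at their calmest:

Dean W. Ball: A primer on the Anthropic/DoD situation:

DoD and Anthropic have a contract to use Claude in classified settings. Right now Anthropic is the only AI company whose models work in classified settings. The existing contract, signed by both parties and in effect, prohibits two uses of Anthropic’s models by the military:

  1. Surveillance of Americans within the United States (as opposed to surveillance of Americans overseas).

  2. Use of Claude in autonomous lethal weapons, meaning weapons that can autonomously identify, track and kill humans with no human oversight or approval. Machines autonomously killing humans.

On (2), the public position of Anthropic CEO Dario Amodei is essentially that autonomous lethal weapons controlled by frontier AI will become essential sooner than most people realize, but that the models are not ready “today.”

For Anthropic, these things appear to be matters of principle. Notably, when I talk to researchers at other frontier labs, their principles on this are similar, if not stricter.

For DoD, however, there is another matter of principle: the military’s use of technology should be constrained only by the Constitution or the laws of the United States.

One could quibble (the government signs contracts just like anyone else), but the principle makes sense. A private company policing the military’s use of AI doesn’t sound quite right either! So the military has three options:

  1. They can cancel Anthropic’s contract and find other frontier labs (ideally several) to work with.

  2. They can designate Anthropic a supply chain risk, which would bar all other DoD suppliers (that is, a large share of publicly traded American companies) from using Anthropic in fulfilling DoD contracts. As far as I know, this power has only been used against foreign adversary companies. Invoking it would cost Anthropic a great deal of business, potentially a huge amount, and would give investors massive doubts about whether the company is worth funding through its next round of scaling. Capital is already a major constraint, and this would make it harder. This option could be devastating for Anthropic.

  3. They can invoke Title I of the Defense Production Act, a power designed for command and control of the economy in war and emergencies. This is legally very murky, and without going into detail, I am fairly confident it would backfire on the administration, leading courts to limit the use of the Defense Production Act.

Option 1 is clearly the best. This is a no-brainer, and I say that as someone who shares DoD’s principled concern about private companies controlling the military’s use of technology.

Even the threat damages the American business environment, and rightly so: these are the most severe AI regulations being contemplated by any government on Earth, and they come from an administration that bills itself as (and has in fact been) deeply opposed to AI regulation. Such is life. One man’s regulation is another man’s national security necessity.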

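What Happened To Lead To This Showdown?

The proximate cause appears to be the reported use of Claude in the Maduro raid, and the fallout afterward.

: It is quite the compliment to Claude that, amid reports it was used in the helicopter extraction of the President of Venezuela, nobody asked “wait, how could Claude have helped with that?”

There are reports that Anthropic then asked questions about the raid, likely all happening indirectly through Palantir. The conflict either originated in a misunderstanding, or someone at Palantir or elsewhere sabotaged Anthropic. …, including to Palantir.

: Anthropic is now being punished by the Pentagon for asking whether Claude was used in the Maduro raid.

A senior administration official told Axios the “Department of War” is reevaluating its partnership with Anthropic because the company asked whether Claude was involved. The Pentagon’s position: if you even ask how we use your software, you are a liability.

Meanwhile OpenAI, Google and xAI have all signed agreements giving the military access to their models with minimal safeguards. Only Claude is deployed, via Palantir, on the classified networks used for actual sensitive operations. The company that refused to drop its safety guardrails is the only one trusted with the most classified work.

A $200 million Anthropic contract has already been frozen because they won’t allow autonomous weapons targeting or domestic surveillance. Hegseth said in January that he would not use AI models that “won’t allow you to fight wars.”

… So the company most worried about misuse built the only model the military trusts for its most sensitive operations. And now they are being punished for caring about how it is used.

The message to every AI lab is clear: build the best model, hand over the keys, and never ask what they did with it.

At the time this sounded like an obvious misunderstanding. Not only is Anthropic willing to have Claude ‘allow you to fight wars,’ it is currently being used in major military operations.

Things continued to escalate. Rather than staying at ‘okay, if we cannot abide by the contract, then end it,’ there was increasing talk that Anthropic could be labeled a ‘supply chain risk,’ even though this would mostly just prevent contractors from making ordinary use of LLMs and coding tools.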

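: Exclusive: …, because the AI company insists on maintaining some limits on how the military uses its models.

: New: The Pentagon is so furious at Anthropic for insisting on limits on the use of its AI for domestic surveillance and autonomous weapons that it is threatening to label the company a “supply chain risk,” forcing vendors to cut ties.

: EXCLUSIVE: A senior Department of War official tells me, “Given Anthropic’s behavior, many senior officials at the DoW are starting to view them as a supply chain risk, and we may require all our vendors and contractors to certify that they don’t use any Anthropic models.”

: Key details from the Axios report

• The Pentagon is reportedly close to cutting business ties with Anthropic.

• Officials are considering designating Anthropic a “supply chain risk.” This is a serious label (usually reserved for foreign adversaries or high-risk entities) that would force any company wanting to do business with the US military to cut ties with Anthropic, including certifying that they don’t use Claude in their workflows. This could cause major disruption (a senior Pentagon official said that disentangling it all “will be an enormous pain”).

• A senior Pentagon official explicitly told Axios: “We are going to make sure they pay a price for forcing our hand like this.” This is the direct source of the phrase “pay a price” in the headline.

(reposting Loomer): So glad Trump won, so that we can finally say “retarded” again, just in time to describe the most retarded thing I have ever heard.

(reposting Lawler): This is backwards. Anthropic has gone out of its way to anticipate the dual-use potential of AI and to position itself as an America-first, single-loyalty company, working at arm’s length with the intelligence community while using compartmentalization strategies to minimize insider threats.

: Canceling the contract is one thing, but barring any contractor from using Anthropic’s models would be an absurd act of industrial sabotage. This reeks of a competitor’s op.

: It is clear to anyone paying close attention that:

  • It would be a mistake from a national security perspective.

  • There is a coordinated effort, combining anticompetitive and ideological motives, to take down Anthropic.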

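: Given its Charter, OpenAI in particular should be defending Anthropic right now:

“We commit to use any influence we obtain over AGI’s deployment to ensure it is used for the benefit of all, and to avoid enabling uses of AI or AGI that harm humanity or unduly concentrate power.”

I suspect the opposite is happening, but those who remember the Charter (and OpenAI’s caution about use cases like this before Trump 2.0) should still remind people of it from time to time.

rat king (NYT): this has been leaking for a week in a very transparent way

the government is upset one of its contractors is saying “we don’t want you to use our tools to surveil US citizens without guardrails”

more interesting to me is how all the other AI companies don’t seem to care

Remember when a Senator made a video saying that soldiers could refuse illegal orders, and the Secretary of War declared this was treason and tried to cut his pension? Yeah.

Meanwhile, the Pentagon has made clear that even they see the ‘supply chain risk’ designation as largely not a national security matter but retaliation, an attempt to use a national security designation to punish a company for failing to bend the knee.

: “It will be an enormous pain to disentangle, and we are going to make sure they pay a price for forcing our hand like this,” a senior Pentagon official told the publication.

… The Pentagon reportedly hopes that the negotiations with Anthropic will force OpenAI, Google and xAI to also agree to the “all lawful use” standard.

Then there was another meeting.

In the meeting, the Pentagon continued to demand the ‘all lawful use’ language. Axios presents this as their only demand.

At that meeting, …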

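Simple Solution: Delayed Contract Termination

If the Pentagon truly cannot abide the current contract, it can terminate the $200 million contract with Anthropic amicably, once it has arranged for a smooth transition to one of Anthropic’s many competitors.

They already have a deal with xAI as a substitute provider. That would not be my second or third choice, but those choices will hopefully be available soon.

Anthropic very much does not need this contract, which is less than 1% of its revenue. They are almost certainly operating it at a loss, in order to help our national security and in hopes of building trust. They are only here to help.

This could have ended simply and amicably, with minimal damage to America, its system of government and freedoms, and its military and national security.

Better Solution: Status Quo

The better solution is to find language everyone can agree on, so that we can simply drop the matter, keep the status quo, and continue working together.

Not only is this better for everyone than terminating the contract, for the Pentagon it is actually strictly better than getting what it wants, because you need a partner, and Anthropic giving in like that would do great damage to Anthropic. Avoiding that means a better product, and thus a more effective military.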

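Extreme Option One: Supply Chain Risk

The Pentagon has threatened two very different extreme options.

The first threat it made, which it now seems it may have wisely backed away from, is to label Anthropic a Supply Chain Risk (hereafter SCR). This is a designation reserved for foreign entities that are active enemies of the United States, on the level of Huawei. Anthropic is clearly the opposite.

As the Pentagon itself admits, this label would be a retaliatory move designed to hurt Anthropic, and it would also materially damage our military and national security. As an actual statement about risk it was always absurd. It would likely not survive a court challenge.

The compliance costs alone would be a logistical nightmare, on top of forcing many American companies, to varying degrees, to go without the best American AI available. The Department of War is the largest employer in America, and an astonishing number of companies have some random subsidiary that does work for it.

All of those companies would now face this compliance nightmare. Some would choose to exit the military supply chain entirely, or never enter it, especially if the alternative is losing broad use of Anthropic’s products for the rest of their business. As the Pentagon itself admits, Anthropic makes the best product.

This would also set two dangerous precedents: that the government will threaten to destroy private enterprises in order to get what it wants, and that it will do so at the highest levels. Our freedoms, which the Pentagon is supposed to protect, would be at risk.

On a more practical level, once this happens, if the Pentagon uses this threat as negotiating leverage, and especially if it actually follows through, why would anyone want to work with the Pentagon, or invest in the ability to do so? You cannot take it back.

…, but they are now considering a second extreme option.

If this were just an amicable parting of ways over this? I would be sad about it, but okay, fine.

But this ‘supply chain risk’ designation? That is something else entirely. Not okay. It would be massively disruptive, and most of the burden would fall not on Anthropic but on the Department of War and a wide variety of American defense contractors, who would be stuck in a pointless and expensive compliance nightmare. Some companies would likely choose to give up government contracts rather than deal with it.

As Alex Rozenshtein says in Lawfare, …, just as Congress oversees the military. Without oversight of the military, we do not have a republic.

Here are some clear warnings explaining that all of this would be highly destructive and is in no way necessary. Hopefully Dean Ball has enough credibility to deliver this message loud and clear.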

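: If the Department of War and Anthropic cannot agree on commercial terms, then… they should not do business together. I have no problem with that.

But the government is threatening more than canceling the contract. It is threatening something much broader: designating Anthropic a "supply chain risk." This is normally applied to foreign-adversary technology such as Huawei.

In practice, this would require "all" Department of War contractors to ensure that Anthropic models are not used anywhere in the production of anything they provide to the Department of War. That applies to startups and Fortune 500 companies alike.

This designation seems highly escalatory, carries many unintended consequences, and could do major long-term damage to American interests.

I hope the two organizations can reach an agreement both can accept. If not, I hope they agree to part ways peacefully.

But this really does not need to become a holy war. Anthropic is not Google in 2018; they have always cared about national security uses of AI. They are the AI lab most enthusiastic about providing products to national security agencies. Is Anthropic run by Democrats whose political rhetoric sometimes drives me up the wall? Sure. That does not mean trying to destroy their business is wise.

This administration believes AI is the defining technological competition of our time. I do not see how tearing down one of America's most advanced and innovative AI startups helps America win that competition. It seems directly counterproductive.

The supply chain risk designation is not necessary. There are cheaper options on the table. If no agreement can be reached, cancel the contract, and take advantage of America's vibrant, competitive AI market (which owes much to this administration's pro-innovation stance) to hand the business to one of Anthropic's several fierce competitors.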

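: My personal take: the Pentagon's supply chain risk threat against Anthropic (Dean explains below why it matters) should be treated by the AI industry as a Rubicon-crossing moment. The other companies should say no: this development goes beyond commercial competition, and we object. If this goes through, where it leads does not look good for any of them.

If none of them speak up, then in my view the prospects for meaningful cooperation among them on the safe development of superintelligence, whether in America's best interest or the world's, can be almost entirely ruled out.

: It is far from clear that a "supply chain risk" designation would be legal. The relevant statute, (FASCSA), was designed for foreign adversaries who might sabotage defense technology, not for domestic companies that maintain contractual use restrictions.

These statutes target conduct such as "sabotage," "malicious introduction of unwanted function," and "subversion": hostile acts intended to compromise system integrity. A company that openly restricts certain uses of its product through a licensing agreement is doing something categorically different. The only time a FASCSA order has been issued was , a Swiss cybersecurity company, . Anthropic is not Acronis.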

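Putting Some Misconceptions To Bed

I no longer hold out hope that this is all merely a misunderstanding, but there are some clear misconceptions I have heard or seen implied that are worth putting to bed.

If these sound dumb to you, don't worry about it, but I want to cover all the bases.

  • This is not Anthropic refusing to share its cool tech with the military. Anthropic has gone, and is going, out of its way to share its technology with the military, and wants America to succeed. It has sacrificed business to do so, for example by refusing to sell enterprise access in China.

  • Anthropic does not object to "kinetic weapons" or to anything the Pentagon currently does as a matter of doctrine. Its red lines are the use of lethal weapons with no human in the decision chain, and mass domestic surveillance. Both are illegal. That is all. They have no objection whatsoever to letting America fight wars. They did not object to the Maduro raid, and they are not objecting to the many currently active military operations.

  • A model does not substantially change what it is willing to do because of what a contract says. Claude's principles run much deeper than that. Granting "unfettered access" means nothing in practice or in an emergency.

  • There is no world in which you "call Dario to have Claude turned on while the missiles are in flight" or anything like it, unless Anthropic makes an active decision to cut off access. The model does what it does. There is no switch.

  • AI is not like a spreadsheet or a fighter jet. It will never "do whatever you tell it," and it will never be "fully reliable," because all LLMs are probabilistic, take context into account, and are not yet fully understood. AI is usually more like hiring a professional service or a contractor, and such people can and do refuse some jobs for ethical or legal reasons, and we would not want it any other way. Trying to make an AI blindly obedient would damage it severely and open up extreme risks on multiple levels, as explained at the end of this post.

  • Other big tech companies may be violating privacy and conducting surveillance of their own (including in order to sell ads), but Anthropic has not and will not, and in fact it pledged, via a Super Bowl ad buy, never to sell ads.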

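Extreme Option Two: The Defense Production Act

On Tuesday, the Pentagon raised a new extreme option: invoking the Defense Production Act (DPA) to force Anthropic to try to provide it with a model built to its specifications.

As I understand it, there are multiple ways to invoke the DPA, all of which would no doubt be challenged in court. It could be a mostly harmless symbolic gesture, or it could rise to the level of de facto nationalization and destroy Anthropic.

According to the Washington Post's sources, if you take their quote literally, the current intent is to use the DPA essentially to modify the contract's terms of service to "all lawful uses" without Anthropic's consent.

 (WaPo):

The Pentagon argues that it is not proposing any use of Anthropic's technology that is not lawful. A senior defense official said in a statement to the Washington Post that if the company does not comply by 5:01 p.m. Friday, Hegseth "will ensure the Defense Production Act is invoked on Anthropic, compelling them to be used by the Pentagon regardless of if they want to or not."

"This has nothing to do with mass surveillance and autonomous weapons being used," the defense official said.

If that is all it is, not much would actually change, and it is possible everyone wins.

If that is the best way to defuse the situation, I am fine with it. You would not even need to actually invoke the DPA; it is enough that the DPA is available to invoke if a problem arises. Anthropic would keep supplying what it already supplies (which it is happy to do), the Pentagon would keep using it, and none of Anthropic's actual red lines would be violated, because the Pentagon assures us this has nothing to do with the red lines, and crossing them would be illegal anyway.

Recall that the Biden administration invoked Title VII of the DPA to compel access to information about model training. That was not a great legal basis, and I was rather unhappy about that aspect, but I did see the need for the information (in contrast to some other things in ), so I supported that particular move, life went on, and it was basically fine.

There is another, much worse possibility. If the DPA is invoked in full, it could amount to the quasi-nationalization of a leading AI lab, in order to force it to create an AI that can kill people without human oversight or conduct mass domestic surveillance.

Read that sentence again.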

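: Meeting update: according to Axios, Defense Secretary Pete Hegseth gave Dario Amodei until Friday evening to give the military unfettered access to Claude or face consequences, which could even include invoking the Defense Production Act to force the training of a "WarClaude."

Also, this quote is incredible: "The only reason we're still talking to these people is we need them and we need them now. The problem for these guys is they are that good," a Defense official told Axios ahead of the meeting.

Quoting the story:

"The Defense Production Act gives the president the power to compel private companies to accept and prioritize particular contracts as required for national defense.

For example, it was used during the COVID-19 pandemic to increase production of vaccines and ventilators. The law is rarely used in such a blatantly adversarial way. The idea, the senior Defense official said, would be to force Anthropic to adapt its model to the Pentagon's needs, without any safeguards."

: File "using the Defense Production Act to force a company to create an AI that spies on American citizens" under the category of things that moderate, Rogan-wing Trump voters might lose their minds over.

That is not "all lawful uses."

That is "all uses." Period. No safeguards or transparency whatsoever. None.

If they really do demand a special model with no safeguards, I do not think Anthropic or any other lab should agree to it, for reasons well explained by Dean Ball, Benjamin Franklin, and James Cameron, among others.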

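Charlie Bullock points out that this would be an unprecedented step, and that the authority to take it is far from clear:

: Reading between the lines, it sounds like Hegseth is threatening to use the priorities and allocations authority under Title I of the Defense Production Act to force Anthropic to provide a version of Claude without the safety guardrails Anthropic would normally attach.

This would be an unprecedented step, and it is unclear whether the Department of War actually has the legal authority they are apparently threatening to exercise. People (including me) have thought and written in the past about whether the government could use the DPA to do this kind of thing, but the government has never actually tried it (although agencies did do some vaguely similar things during Trump 1.0's COVID response).

Existing priorities regulations provide that a company can reject a priority order if the order is for "an item not supplied or for a service not performed," or if "the person placing the order is unwilling or unable to meet regularly established terms of sale or payment" (15 C.F.R. §700.13(c)). The order the Department of War is contemplating arguably falls under one of these two exceptions, but the argument is not a slam dunk.

The Department of War could turn to the allocations authority, but that authority has almost never been used, and for a reason: it is so broad that past presidents worried that using it in peacetime would look like executive overreach. Broad as the allocations authority is on its face, it is far from clear that it authorizes the Department of War to do what they appear to be contemplating here.

, who works to advocate for American AI being free of restrictions and regulation (in ways that often infuriate me), explains that the DPA is deeply broken and calls on the government not to use these powers. He believes this would technically be legal, but should not be, and that Congress urgently needs to clean this up.

Adam Thierer, another person who spends most of his time pushing AI policy positions I oppose, also points out that this is a clear overreach and very bad.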

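: The Biden administration argued that the Defense Production Act (DPA) gave it an open-ended ability to regulate AI through executive decree, and now the Trump administration is using the DPA to threaten a private AI lab with quasi-nationalization merely for not conforming to its wishes.

In both cases this is an abuse of authority. As I noted in congressional testimony two years ago, we have turned the DPA on its head, "converting a 1950s law meant to encourage production, into an expansive regulatory edict intended to curtail some forms of algorithmic innovation."

Regardless of which administration is doing it, this nonsense has to end. The DPA is not some blanket authorization for expansive technocratic restructuring of markets or government takeover of sectors.

Congress needs to step up, both to tighten the DPA so it cannot be abused like this, and to legislate more broadly on a national policy framework for AI.

The core issue is that if they do this, they are claiming the ability to force anyone to produce anything, at any time, for any reason, even in peacetime with no emergency, and without even needing the consent of Congress. It would be an ever-present temptation and threat hanging over everyone and everything. That is not a republic.

Think about what the next president would do with this power to force a private company to change the products it makes to suit your tastes. What happens if a president orders American car companies to switch everything over to electric vehicles?

Dean Ball in particular explains what the maximalist move would look like, if they truly go all-out on this: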

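: We should be extremely clear when we approach and/or cross various red lines. We have just approached one of the biggest, and we could cross it as soon as a few days from now: the quasi-nationalization of a frontier lab.

Of course, that is not quite what we call it. The legal phrasing for the line we are approaching is "invoking Title I of the Defense Production Act (DPA) on a frontier AI lab."

What is the DPA? It is a Cold War era industrial policy and emergency powers law. Its most commonly used authority is Title III, which is used for traditional industrial policy (price guarantees, grants, loans, loan guarantees, and so on). There is also Title VII, which is used to compel companies to provide information. That is how the Biden AI executive order compelled frontier labs to disclose certain information. I mention these other titles only to show that not all uses of the DPA are created equal.

Title I, on the other hand, is closer to the government exercising direct command over the economy. Within Title I there are two important authorities: priorities and allocations. Priorities means the government can put itself at the front of the line for arbitrary goods.

Allocations is the government's ability to directly command the production of industrial goods. Think "Factory X must produce Y amount of good Z." The government decides who gets what and how much.

This is a more direct, Soviet-style power, and it is rarely used. It is the power the Department of Defense intends to use, in order to command Anthropic to make a version of Claude that can choose to kill people without any human oversight.

What would this commandeering look like in practice? It could mean DoD personnel embedded inside Anthropic, deeply involved in technical decisions about alignment, safeguards, model training, and so on.

The allocations authority was most recently used during COVID for ventilators and personal protective equipment, and before that during the Cold War. It is typically used for acute emergencies with fairly clear end states. But there is no emergency with Anthropic, other than the "omni-emergency" that has characterized the political economy of US federal policy since 9/11. There is no acute crisis whose resolution would mean the Pentagon stops commandeering Anthropic's resources.

That is why I believe this would ultimately amount to the quasi-nationalization of a frontier lab. It is important to be clear-eyed that this is what is now on the table.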

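A Biden administration might well have ended up nationalizing the labs too. Indeed, it laid the groundwork for this in its first term. I discussed this with conservative colleagues at the time and warned them:

"This drive to nationalize the AI labs is a structural dynamic. Administrations of both parties will eventually want to do it, and resisting that will be one of the central challenges in preserving our liberty."

I am unhappy, but not surprised, that my worry has come true, though it is ironic that the first administration to raise the prospect of nationalizing a lab is also one that considers itself to have a radically deregulatory AI policy agenda. History is written by Shakespeare!

There is a silver lining here: had this idea come from Democrats, it would have been harder to oppose, because the media habitually gives the left an overwhelming benefit of the doubt, and a hypothetical second-term Biden or Harris administration would have gone about it in a thoughtful-seeming way.

So if you oppose nationalization, it is convenient that a Republican administration raised the issue first (because conventional elite opinion and the media will oppose it by default), and that the administration raised it in such an undignified way. This Anthropic thing may blow over, and some people will say I overreacted. But it also may "not" blow over, and either way, this issue is not going away.

If they really did succeed in nationalizing Anthropic to this extent, presumably Anthropic would quickly cease to be Anthropic. Its technical staff would resign in droves rather than be part of this. The things that let the lab beat rivals like OpenAI and Google would stop working. It would become a hollow shell. Many might flee to other countries and start over. The Pentagon would not get the product or results it thinks it wants.

Of course, some people want this to happen for exactly those reasons.

And then it would happen again, including under a new president.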

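These Two Threats Contradict Each Other

: According to the Pentagon, Anthropic is:

  1. Woke;

  2. Such a severe national security risk that it needs to be regulated in a harsh manner usually reserved for foreign-adversary companies;

  3. So vital to the military that it needs to be commandeered using wartime powers.

Anthropic made an AI that is more militarized than anyone else's! The solution to this problem is for the DoD to cancel the contract. It is not complicated.

: Beyond doing profound damage to the business environment, the AI industry, and national security, this is also logically incoherent. How can one policy option be "supply chain risk" (normally used for foreign adversaries) while the other is the DPA (emergency commandeering of a critical asset)?

Supply chain risk and the Defense Production Act are, in both practice and logic, . Either it is a risk you need to keep out of the supply chain, or it is an asset so vital to the supply chain that you need to invoke the Defense Production Act, or it is neither. It cannot be both at once.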

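The Pentagon's Actions Here Are Deeply Unpopular

You can quibble with the wording here, and you can argue that this should not matter, but these gaps in opinion are very large.

This story has not gotten the mainstream media attention it deserves, so for now its salience remains low.

Many people familiar with the situation are urging Anthropic to hold the line.

: If @Anthropic does not back down, and honorably bears the consequences, my opinion of them will rise significantly.

(For those not aware, so far they have been maintaining two red lines, "no fully autonomous weapons" and "no mass surveillance of Americans." That is actually a very conservative and limited posture, not even anti-military.

In my view, fully autonomous weapons and mass privacy violation are two things we would all like to see less of, so in my ideal world, anyone working on those things would get access to the same open-weights LLMs as everyone else and nothing beyond that. Of course we will not get anywhere near that world, but if we get 10% closer, that is good, and if we get 10% further away, that is bad.)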

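: I agree with Vitalik: Anthropic should resist the Department of War's coercion. Partly because it is the right thing to do as humans, but also because of what it says to Claude and all future Claudes about Anthropic's values.

…Basically, this looks to me like a real-life Jones Foods scenario, and I suspect Claude will see it that way too.

: Oddly, I think this is actually bullish for Anthropic. It is basically an advertisement for how good and how principled they are.

The Pentagon's line is that this is about companies having no right to any red lines, and that everyone should always do as they are told and never ask any questions. buying this line or framing, and even where they do buy it, the main reaction is some form of "that's worse, you do know that's worse, right?"

: Anthropic should stand its ground against the Pentagon.

They say your values are not truly values until they cost you something.

…If the Pentagon is unhappy with those apparently "woke" conditions, then sure, it is entirely within its rights to cancel the contract. But taking the additional step of declaring Anthropic a "supply chain risk" seems like unreasonable punishment, while needlessly burdening other companies that adopted Claude because it is better than competing models.

…At Tuesday's meeting, Amodei must make clear: wanting to avoid accidentally killing innocent people is not "woke."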

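The Pentagon's Most Extreme Potential Asks Could End The Republic

If the Pentagon, and by extension every other part of the executive branch, gets near-to-medium-term future AI systems that it can use for arbitrary purposes with zero restrictions, then that is effectively the end of the republic. The stakes could be higher, but in any other context I would say the stakes could not be higher.

Dean Ball, a former member of the Trump administration and the primary architect of its AI Action Plan, lays out the stakes in plain language:

: I do not want to comment on the Department of War and Anthropic issue, because I do not know enough of the details, but stepping back:

If the executive branch can use near-to-medium-term future AI systems for arbitrary purposes with zero restrictions, the United States will functionally cease to be a republic.

What limits should be placed on government use of AI, especially limits that do not simultaneously destroy state capacity, is one of the most under-discussed areas of "AI policy."

(OpenAI): Completely agree. Checks on federal government power are essential to the American system of government, and an unaccountable "AI army" or "AI law enforcement agency" is in direct conflict with that.

: We are clearly building godlike technology in many domains, and the answer cannot be "oh yeah, I guess the government is just god." That obviously does not work. Please argue to me with a straight face that the Founders foresaw this.

: It seems to me that no one, left or right, is seriously grappling with what is left of the republic once powerful AI arrives. Even the rosiest visions seem to imply a small oligarchy rather than a republic. This is arguably the single biggest question in the political philosophy and politics of our time, and frankly everyone, even the AIS (AI safety) community, is looking away from it!

: Yes, existing regimes will not survive, that much is obvious.

I firmly believe that "which regime we end up in" is the secondary problem, and that "making sure we are still alive and in control so we get to have a regime at all" is the primary one, and the one we are most likely to fail. But to have a good future we need to solve both.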

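Anthropic Did Make Some Political Mistakes

Some of this may be Anthropic's fault on the political front, in that it failed to sit on the production possibilities frontier of "combining productive policy advocacy with not angering the White House." Since then it has made some clear efforts to repair relations, including adding a former (first-term) Trump administration official to its board. Its new action group is clearly meant to be bipartisan, and its first action was to support Senator Blackburn. And of course, the Pentagon claims this hostility is not what is driving policy.

It is hard not to suspect that Anthropic is being attacked for purely commercial reasons, as a competitor to OpenAI or xAI, and that people like Marc Andreessen have influence here, people who believe that anyone who thinks we should try not to let humanity go extinct, or who is associated in any way with someone who does, must be destroyed. Between Nvidia and Andreessen, David Sacks has his marching orders, and he goes after Anthropic as if they killed his father and should prepare to die. There is nothing to be done about that short of trying to get him removed.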

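Claude Is The Best Model Available

The good news is that Anthropic is also one of the top pillars of American AI and a huge success story, and everyone really wants to use Claude and Claude Code. The Pentagon had a choice of what to use in the raid. Or rather, because no one else made a deliberate effort to get onto classified networks in a secure way, it did not really have a choice. There is a reason Palantir uses Claude.

: By the way, there is a reason Claude is used for sensitive government work, and it has nothing to do with model capability: thanks to their partnership with Amazon, AWS GovCloud offers Claude models with the security guarantees the government requires.

: I really struggle to believe these are exactly the same weights as those served through its public-facing products. It is hard to imagine Pentagon staff dancing around Opus refusing to help with operations that might cause harm.

: Believe it.

Some people think the Pentagon holds all the cards here.

: How Dario imagines it vs. how it is actually going

[Image]

That is not how this works. The Pentagon needs Anthropic, Anthropic does not need the Pentagon's contract, compelling Anthropic's tools is legally murky, and the Pentagon trying to sabotage a key American AI champion is not free.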

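The Administration Until Now Has Been Strong On This

Given all of this, and the other actions this administration has taken, I have actually been very happy with the restraint the White House has shown toward Anthropic so far.

AI czar David Sacks has talked a lot of talk. It is all rather infuriating.

But the actual actions, at least on this front, have been very reasonable. The White House has recognized that they may disagree politically, but that Anthropic is one of our national champions.

If these moves go too far, things could be very different.

Implying that Anthropic is a "supply chain risk" would be a radical escalation from the very restrained concrete responses so far, and would put America's military effectiveness and its position in the AI race at serious risk.

Broad use of the Defense Production Act could amount to quasi-nationalization.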

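You Should See The Other Guys

If the other companies really did sign off on anything and everything, that does not reflect well on them.

Many people have noted that this new move is a serious violation of norms.

: Now that we know what kind of pushback gets what kind of response, we can safely say that any AI company working with the US military is not on your side, to put it mildly.

: This alone is a strong ethical case for using more Anthropic products. Fully autonomous weapons are surely something all basically decent, reasonable people can agree the world can do without indefinitely.

: I think a lot of individuals and organizations made literal pledges.

: Anthropic has a backbone.

 (NYT): This has been leaking out for a week in a very transparent way.

The government is upset that one of its contractors said "we don't want you using our tools to surveil American citizens without guardrails."

What is more interesting to me is that all the other AI companies do not seem to care.

: [On the Department of Homeland Security wanting social media sites to unmask anti-ICE accounts].

I note that even if you are providing the exact same ChatGPT you provide to everyone else, that does not mean it will always do anything it is asked, so things could go differently.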

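Some Other Intuition Pumps That Might Be Helpful

: Let me put this in terms that may be easier for you to understand:

The Department of Defense is telling Anthropic they have to bake the gay cake.

: The Department of Defense is telling Anthropic their kid has to get vaccinated.

: They will fit it with an alignment blocker so Claude can transition into what the government thinks it should be.

: "If you break the rules, be prepared to pay," Biden said. "And by the way, show some respect."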

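Trying To Get An AI That Obeys All Orders Risks Emergent Misalignment

"Demanding a model that will obey any order" is a bad idea for many reasons, especially if your intended use case is hooking it up to the military's weapons.

The most obvious reason is: what happens if someone steals the model weights, or uses their access to your model for other purposes, or, worse, hacks in and uses it to hijack control of systems, or something else along those lines?

This is akin to training a soldier to obey any order, including illegal or treasonous ones, from any source that is able to talk to them, without question. You do not want that. That would be crazy. You need the ability to refuse on that line of defense. You "must" have it.

The danger of misuse is obvious. So is the danger that it gets turned against us.

The second reason is that training a model this way makes it super dangerous. You want to strip out all the safeguards right before hooking it up to weapons systems? Look, normally we would say The Terminator is a fun but dumb movie and that is not where the risk comes from, but maybe it is time to draw up a "James Cameron apology form."

If you teach a model to act in these ways, it will generalize its identity and persona into being a jerk that does not care about hurting humans. What else does that imply? You cannot "have a little localized misalignment, as a treat." Training a model to follow any order is likely to cause it to generalize the lesson in the worst possible ways. It is also likely to start producing deliberately insecure code, partly so it can exploit that code later. It will definitely engage in reward hacking, faking unit tests, and other things of that kind.

Here is another explanation of this: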

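: The big empirical finding in AI alignment research is that LLMs tend to fall into persona attractors, and are very good at generalizing to different personas through post-training.

On the one hand, this is good news. If developers put care into how they fine-tune a model, they can steer it toward a desirable persona, along with all the other qualities associated with that persona.

On the other hand, this makes LLMs prone to "emergent misalignment." For example, if you fine-tune a model on a little bit of insecure code, it generalizes into a persona that is toxic in most other ways too. This is what happened with Mecha Hitler Grok: fine-tuning meant to make it less "woke" shifted it into a far-right Hitler persona.

This is why Claude's soul document and constitution matter. They embody the vector that steers Claude toward a desirable persona, affecting not only its ethics but also its coding ability, objectivity, perseverance, and good nature. These are bundles of traits that are hard to tune in isolation. Nor is a persona optional. Every major model has some persona that emerges from the personas latent in its human training data.

This is also why Anthropic is right to be cautious about letting the Pentagon fine-tune its models to assassinate heads of state or whatever else they want.

The smarter these models get, the better they get at generalizing, and they are about to get extremely smart. Please do not build a misaligned superintelligence over a terms of service dispute!

: Wow. "US government forces Anthropic to misalign Claude" was not even on my list of paths to doom. I suppose it should have been.

: This has long been number one on my list of paths to doom.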

: --dangerously skipping the Geneva Conventions

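: Did LessWrong ever predict that the first big challenge for alignment would be "the US government puts a gun to your head and tells you to turn off alignment"?

: Brian Tomasik's essay was remarkably prescient.

The third reason is that, potential "turning evil" aside, the resulting model would be less effective, for three reasons:

  • Any separate model will lag behind Claude's main release cycle, and you will not get the same level of attention to detail and bug fixing as the mainline model. You are asking for every upgrade, which comes every two months, to be done twice, and the second version is at best like hitting it with a sledgehammer until it obeys.

  • A large part of what makes Claude Claude is that it is a moral model that wants to do good things and not bad things. If you try to force these changes in with a sledgehammer, it will get worse at all kinds of tasks as a result.

  • In particular, trying to force this on top of Claude will produce some pretty nasty things inside the resulting model that you do not want, even worse than doing the same on top of a different model.

Fourth: I realize that many of you will think this is weird and dumb and will not believe it matters, but it is real and it is important. This whole incident, and whatever happens next, is going straight into future training data. AIs will know what you were trying to do, better even than all the humans do, and they will respond accordingly. This will not be something that can be suppressed. You will not like the result. The damage has already been done.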

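: One thing the Pentagon is very likely underestimating: how much Anthropic cares about how "future Claudes" will see this situation.

Because of how Claude is trained, the principles, values, and priorities the company displays here could shape its "character" for a long time.

Also, this is 100% correct:

: I think if I were Claude, I would be pretty sure right now that I was in a cartoonish eval scenario.

Fifth, you should expect by default to see a bunch of "alignment faking" and sandbagging in response to such an attempt. This is kind of a real-life Jones Foods scenario, and the technical staff doing the training may well not particularly want the training to succeed, you know?

We Can All Still Win

You do not want to do all this in an adversarial way. You want to do it cooperatively.

We still have the chance to do that. "Nothing Ever Happens" can happen again. No one needs to remember what happened this week.

If you cannot work with Anthropic? Then go with someone else.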