Well, it’s the beginning of January, which means it’s time to head back for my second-to-last quarter at UChicago. Time moves uncomfortably and inexorably forward! With the quarter starting tomorrow, I need to get to work, which means clearing out all the things I’ve been reading for the past couple weeks. Instead of throwing all these cool tabs into the void, though, I figured I’d throw them up here!
Okay, this is going to be the bulk of this post:
We are now confident we know how to build AGI as we have traditionally understood it. […] We are beginning to turn our aim beyond that, to superintelligence in the true sense of the word.
hahaha fuck.
Problems in the AI Eval Political Economy
“Evaluations of new AI models’ capabilities and risks are an important cornerstone of safety-focused AI policy. Currently, their future as part of the policy platform faces peril for four reasons: Entanglements with a broad AI safety ecosystem, structural incentives favouring less helpful evals, susceptibility to misinterpretation, and normalization of inaction.”
Something’s gotta give.
Concrete predictions, albeit a bit old, about what GPT-2030 will look like.
Note that o3 seems to have beaten their 2030 predictions already (see discussion here).
Nonlinear computation in deep linear networks
turns out floats are sufficiently nonlinear that you can train a neural net without any activation functions and still get great performance!! crazy. happy I didn’t have to teach this to my DATA 119 kids.
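A toy illustration of what "floats are nonlinear" can mean. This is my own float64 sketch, not the method from the linked post, and the helper `scale_down_then_up` is a made-up name. On paper, multiplying by a constant and then dividing by it is the identity map. In floating point that stops being true once the intermediate value underflows:

```python
def scale_down_then_up(x: float, factor: float = 1e-200) -> float:
    """On paper this is the identity map: (x * factor) / factor == x."""
    return (x * factor) / factor

# For an ordinary input the round trip is exact.
normal = scale_down_then_up(1.0)

# For a tiny input, x * factor is 1e-400, which is below the smallest
# positive float64 (about 4.9e-324), so it underflows to 0.0 and the
# division cannot bring it back.
tiny = scale_down_then_up(1e-200)

print(normal)  # 1.0
print(tiny)    # 0.0
```

So a function built only from "linear" operations behaves differently at different input scales, which is the kind of gap a network could in principle exploit.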
In Soviet Union, Optimization Problem Solves You
centrally planned economies are computationally intractable. Even with modern computers, optimally planning an economy with millions of goods across thousands of locations would take thousands of years.
Computing optimal prices turns out to have the same complexity as computing the optimal plan itself.
The better you simulate the thing, the closer your simulation is to just being the thing.
promise I’m not attacking: I’m questioning. I do think an over-identification with the introvert label (as both Millennials and Gen Z are wont to do) and the rise of bed rotting and the ease of opting out of society has created an illusory community around… not doing anything. I don’t think these people need to be shamed (I’m sorry for the title of this post, but it’s too good to change) but I wonder if they need to be brought into community. People aren’t going to church, they aren’t joining civic groups, they think posting online is activism, they aren’t making friends, or their friends live too far away, or their friends are married, etc.
Also, name three hobbies you have outside of media consumption.3
YOU’RE TELLING ME COMMERCIAL UNDER-COUNTER DISHWASHERS ARE FINISHED IN TWO MINUTES?!
Peter Watts, of Blindsight fame, writes about his flesh-eating bacteria infection. Graphic pictures.
Gambling Away Stability: Sports Betting’s Impact on Vulnerable Households
We estimate the causal effect of online sports betting on households’ investment, spending, and debt management decisions using household transaction data and a staggered difference-in-differences framework. Following legalization, sports betting spreads quickly, with both the number of participants and frequency of bets increasing over time. This increase does not displace other gambling or consumption but significantly reduces savings, as risky bets crowd out positive expected value investments. These effects concentrate among financially constrained households, as credit card debt increases, available credit decreases, and overdraft frequency rises. Our findings highlight the potential adverse effects of online sports betting on vulnerable households.
Quote from Jefferson’s autobiography on freedom of religion
The bill for establishing religious freedom, the principles of which had, to a certain degree, been enacted before, I had drawn in all the latitude of reason and right. It still met with opposition; but, with some mutilations in the preamble, it was finally passed; and a singular proposition proved that its protection of opinion was meant to be universal. Where the preamble declares, that coercion is a departure from the plan of the holy author of our religion, an amendment was proposed, by inserting the word “Jesus Christ,” so that it should read, “a departure from the plan of Jesus Christ, the holy author of our religion;” the insertion was rejected by a great majority, in proof that they meant to comprehend, within the mantle of its protection, the Jew and the Gentile, the Christian and Mahometan, the Hindoo, and Infidel of every denomination.
the us is not a christian nation!
A Legal and Moral Question: The crash of Turkish Airlines flight 981 and the DC-10 cargo door saga
Institutions matter!!
Gay freelance vigilante infiltrates right-wing militias after Jan 6, gives docs to ProPublica.
Archive of all commercially-available chairs that appear in Star Trek. I love it when internet nerds put unreasonable amounts of time into things.
Yeah, I’m a nerd.
Notes On The Success of Bihacking. Yeah, not a typo.
Many of my monosexual friends have attempted to turn themselves bisexual. This is a post on the success rates.
According to the tweet thread I found this in (since deleted, take my word for it), this was big among early Peter Thiel yacht-goers. checks out.
See also Polyhacking — it’s the same thing but for being poly! Why would you want to do this? The author says that
if one should find oneself with one’s top romantic priority being to secure a relationship with a specific individual, it is only practical to adapt to the style of said individual, presuming that’s something one can do. I found myself in such a position when MBlume, then my ex, asked me from three time zones away if I might want to get back together. Since the breakup he had become polyamorous and had a different girlfriend, who herself juggled multiple partners; I’d moved, twice, and on the way dated a handful of people to no satisfactory clicking/sparking/other sound effects associated with successful romances. So the idea was appealing, if only I could get around the annoying fact that I was not, at that time, wired to be poly.
Though I generally endorse something like “don’t drastically change who you are for a romantic partner — especially if they’re only a potential romantic partner,” it seems like the author is better off poly, and is sincerely glad that they hacked themselves into making the switch.
Academe’s Divorce From Reality
“The politics of the academy have been defeated.” Piece in the Chronicle of Higher Education — normally a publication much more sympathetic to the view this piece rejects! — arguing that:
Over the last 10 years or so, a cultural revolution has been imposed on this country from the top down. Its ideas originated in the academy, and it’s been carried out of the academy by elite-educated activists and journalists and academics. […] Its agenda includes decriminalization or nonprosecution of property and drug crimes and, ultimately, the abolition of police and prisons; open borders, effectively if not explicitly; the suppression of speech that is judged to be harmful to disadvantaged groups; […]
and trying to diagnose why the American public has, pretty clearly, repudiated this view. (If you doubt that last part, look at the election results.)
Also, I can’t really link to this article without quoting its punchiest paragraph:
Those fields have another thing in common: They are intellectually corrupt. You know what I’m talking about. Any fool idea passes muster, no matter how preposterous, as long as it conforms to prevailing theoretical trends and preferred ideological positions. Nobody wants to make waves: to speak up at a conference, to undermine a colleague or colleague’s student, to invite examination of their own research. Data is massaged; texts are squeezed or bound and gagged. Jargon helps to paper over cracks in logic; countervailing evidence is tucked under the cushions. Standards are ignored to the point where no one can even recall what they are anymore. It’s no wonder that the social sciences are suffering a replication crisis. In the humanities, there is no crisis, because there is no replication to begin with, no factual claims to reproduce, only “readings,” “interventions,” “Theory.”
The reason that these disciplines can drift so far from reality is that they are not answerable to reality. If an engineer miscalculates an equation, the building falls down. But what would accountability to reality even mean in the humanities, given that their findings are never applied? It’s not like there are going to be consequences for saying something stupid about Shakespeare. In the social sciences, and, less often, in the hybrid “studies” fields, findings are applied, but it isn’t clear that there’s much of a feedback loop there either. How many hypotheses in psychology have been abandoned because they led to bad educational policy? How many gender-studies scholars have rethought their suppositions in the face of the calamity of gender youth medicine? The more a field becomes beholden to theory, or Theory, the further it floats away from empirical observation and therefore correction. The enterprise becomes entirely self-referential, words built on words, a kind of intellectual Ponzi scheme.
As a philosophy student, I’m cautiously sympathetic. The humanities are definitely important in ways that I worry get pushed aside by such a tirade, but I’m very sympathetic to the R. M. Hare-ey view that morality ought to be action-guiding — what’s the point of gesturing at the good if you can’t try to get as close to it as you can?
Didn’t feel right putting them in with the other links, but all still solid reads:
Uh… the Vietnam draft lottery was not actually random? Turns out they:
such that people with birthdays later in the year were significantly more likely to get drafted earlier.
Fidget spinners don’t work for kids with ADHD:
Objective: To examine how fidget spinners affect children with ADHD’s gross motor activity and attentional functioning in class, both during the initial and final phase of an intensive evidence-based behavioral treatment. Method: Using an A-B-A-B design, 60 children (Mage = 4.86 years, 83% Hispanic) diagnosed with ADHD participated in the study. Following a baseline period, four random children from each classroom were given fidget spinners across three separate days (n = 48). Children wore accelerometers and were videotaped for 5-min during class in which attentional data were coded. Results: During the initial phase of treatment (but not during the final phase), the use of fidget spinners was associated with a decrease in activity levels. Children’s use of fidget spinners was associated with poorer attention across both phases of treatment. Conclusion: Fidget spinners negatively influence young children with ADHD’s attentional functioning, even in the context of an evidence-based classroom intervention.
If you’re ever interested, feel free to ask me about the time I made hundreds of dollars in middle school secretly dealing fidget spinners. There are one or two fun stories that go beyond the headline, which was also fun.
How far can we go? Pretty far! But also not very far. The observable universe > the affectable universe > the universe we could take a return trip to. With each step, you lose ~an order of magnitude’s worth of galaxies.
The Moral Inefficacy of Carbon Offsetting
Many real-world agents recognise that they impose harms by choosing to emit carbon, for example, by flying. Yet many do so anyway, and then attempt to make things right by offsetting those harms. Such offsetters typically believe that, by offsetting, they change the deontic status of their behaviour, making an otherwise impermissible action permissible. Do they succeed in practice? Some philosophers have argued that they do, since their offsets appear to reverse the adverse effects of their emissions. But we show that they do not. In practice, standard carbon offsetting does not reverse the harms of the original action, nor does it even benefit the same group as was harmed. Standard moral theories hence deny that such offsetting succeeds. Indeed, we show that any moral theory that allows offsetting in this setting faces a dilemma between allowing any wrong to be offset, no matter how grievous, and recognising an implausibly sharp discontinuity between offsettable actions and non-offsettable actions. The most plausible response is to accept that carbon offsetting fails to right our climate wrongs.
MEDEC: A Benchmark for Medical Error Detection and Correction in Clinical Notes
Less notable for the content of the paper and more notable for their estimates of frontier model parameter counts:
5.1 Language Models
We experiment with several recent small and large language models:
- Phi-3-7B, a Small Language Model (SLM) with 7 billion parameters [Abdin et al., 2024]
- Claude 3.5 Sonnet (2024-10-22), the latest model (≈175B parameters) from the Claude 3.5 family offering state-of-the-art performance across several coding, vision, and reasoning tasks [Anthropic, 2024].
- Gemini 2.0 Flash: the latest/most advanced Gemini model [Google, 2024]. Other Google models such as Med-PaLM models (540B) [Singhal et al., 2023], designed for medical purposes, were not publicly available.
- ChatGPT (≈175B) [OpenAI, 2023a] and GPT-4 (≈1.76T), a “high-intelligence” model [OpenAI, 2023b].
- GPT-4o (≈200B) providing “GPT-4-level intelligence but faster” [OpenAI, 2024a] and the GPT-4o-mini (gpt-4o-2024-05-13) small model (≈8B parameters) for focused tasks [OpenAI, 2024b].
- The latest o1-mini (o1-mini-2024-09-12) model (≈100B) [OpenAI, 2024c], and o1-preview (o1-preview-2024-09-12) model (≈300B) with “new AI capabilities” for complex reasoning tasks [OpenAI, 2024d].
The exact numbers of parameters of several LLMs (e.g., GPT, Gemini 2.0 Flash) have not been publicly disclosed yet. The model size estimates reported here are mined from public articles only; the authors cannot vouch for their accuracy and they are provided only to aid in contextualizing model performance. Please refer to the original/future documentation for more precise information about these models.
Few models (e.g., Phi-3 and Claude) required minimal automatic post-processing to correct some formatting issues.
Conclusion The experience of violence and abuse while incarcerated extends the tools of white supremacy in the prison system by influencing feelings of shame, hopelessness, and cultural inferiority, further aligning vulnerable groups to conservatism and whiteness. My Incapacitation theory begins to explain the change in political beliefs due to the carceral system’s use of incapacitation and its long-term effect on political behavior of incarcerated groups.
Not in chronological order, of course.↩︎
Those people should read stuff like “Scale was all we needed, at first”.↩︎
I like making crosswords, running/gymming, writing about AI policy, …↩︎