<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Blogs on kieranhealy.org</title>
    <link>https://kieranhealy.org/blog/</link>
    <description>Recent content in Blogs on kieranhealy.org</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-US</language>
    <lastBuildDate>Thu, 21 May 2026 15:50:40 -0400</lastBuildDate>
    
    <atom:link href="https://kieranhealy.org/blog/index.xml" rel="self" type="application/rss+xml" />
    
    
    
    
    <item>
      <title>Zero Sum Problems</title>
      <link>https://kieranhealy.org/blog/archives/2026/05/21/zero-sum-problems/</link>
      <pubDate>Thu, 21 May 2026 15:50:40 -0400</pubDate>
      
      <guid>https://kieranhealy.org/blog/archives/2026/05/21/zero-sum-problems/</guid>
      <description>&lt;p&gt;Over at &lt;a href=&#34;https://daringfireball.net&#34;&gt;Daring Fireball&lt;/a&gt;, John Gruber makes a passing observation about the Apple Sports app:&lt;/p&gt;



&lt;blockquote&gt;
    &lt;p&gt;I’ve got some gripes about certain specific aspects of Apple Sports. Like, where does one even &lt;em&gt;start&lt;/em&gt; to explain how much is wrong with &lt;a href=&#34;https://daringfireball.net/misc/2026/05/apple-sports-team-stats-wtf.png&#34;&gt;their zero-sum visualization of team stats&lt;/a&gt;? Has anyone ever even seen a presentation like that before? &lt;a href=&#34;https://kieranhealy.org/&#34;&gt;Anyone&lt;/a&gt;?&lt;/p&gt;

&lt;/blockquote&gt;

&lt;p&gt;That &amp;ldquo;Anyone&amp;rdquo; link lands over here. Hi everyone! The team stats image &lt;em&gt;is&lt;/em&gt; quite confusing. It&amp;rsquo;s a summary of a game between the San Antonio Spurs and the Oklahoma City Thunder. I don&amp;rsquo;t know much about basketball, but I do know a bit about data visualization and in a pleasing coincidence my former student &lt;a href=&#34;https://www.linkedin.com/in/joshua-fink&#34;&gt;Josh Fink&lt;/a&gt; is the A-VP of Basketball Data Science for the Spurs. Here is the image that John objected to:&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/05/21/zero-sum-problems/apple-sports-team-stats-wtf.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/05/21/zero-sum-problems/apple-sports-team-stats-wtf.png&#34;
         alt=&#34;Confusing Apple Sports team stats visualization.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;I had to look at it for a while as well.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;I just finished driving a very long way up the side of the country, so I’m kind of tired. But even allowing for that, boy, this way of representing things really is quite confusing. Not being an Apple Sports user I had to look at it for a bit to understand what was happening. But, now that it has given me a headache, I can kind of see why whoever designed this ended up in the undoubtedly bad place they did.&lt;/p&gt;
&lt;p&gt;Before I get to why I have some sympathy for the designer, &lt;em&gt;why&lt;/em&gt; did I find this representation of these numbers so disorienting? It&amp;rsquo;s not just because I&amp;rsquo;ve been driving for nine hours. John is right to call the picture a &amp;ldquo;Zero Sum&amp;rdquo; representation. The design &lt;em&gt;strongly&lt;/em&gt; suggests to the viewer that, within each row, we&amp;rsquo;re looking at each team&amp;rsquo;s share of a total. Each pair of black and blue lines seems to be vying for control of its whole row, with the longer line being the &amp;ldquo;winner&amp;rdquo; in each case.&lt;/p&gt;
&lt;p&gt;This sort of representation would make perfect sense for a measure that really
&lt;em&gt;was&lt;/em&gt; zero sum. Take an example from a properly good sport, like rugby. There,
like in basketball, to a first approximation a team either has the ball or it
doesn’t.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; But there&amp;rsquo;s no shot clock in rugby, and possession routinely gets
turned over without the game stopping. So, knowing that Team A had 65%
possession is not only informative, it also immediately entails that Team B had
35%. You could show that with a representation like one of the rows above.&lt;/p&gt;
&lt;p&gt;Literally none of the measures in the basketball data above are zero-sum in this way. Both teams could shoot 100% from the free throw line, or zero percent. But because the first three measures shown are percentages, this reinforces the zero-sum impression given by the lines. It certainly did that in my case. But then, starting with Assists, the remaining rows are just absolute numbers. When I started looking at the absolute numbers, I got confused a second time by the length of the lines. &amp;ldquo;Oh so it&amp;rsquo;s not a share, it&amp;rsquo;s the value&amp;rdquo; I thought&amp;mdash;but no, the lengths do correspond to each team&amp;rsquo;s relative share within each row. But they&amp;rsquo;re not really &lt;em&gt;shares&lt;/em&gt;, they&amp;rsquo;re just &lt;em&gt;magnitudes&lt;/em&gt;. But they have to be shown in a fixed space and we want to make them relatively comparable somehow so &amp;hellip;  Argh.&lt;/p&gt;
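&lt;p&gt;To make the problem concrete, here is a minimal R sketch of what a zero-sum row implies. The two percentages are hypothetical stand-ins, not the values from the screenshot, and the scaling rule (bar length equals a value divided by the row total) is an assumption about how the bars are drawn.&lt;/p&gt;

```r
# Hypothetical free-throw percentages for two teams (not the real game values).
ft_pct = c(team_a = 88, team_b = 75)

# Assumed zero-sum scaling: each bar's length is its value over the row total.
bar_share = ft_pct / sum(ft_pct)
round(bar_share, 2)
```

&lt;p&gt;Under that assumption the row would split roughly 54 to 46, even though neither team made 54% or 46% of anything.&lt;/p&gt;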
&lt;p&gt;It would be nice if there were One Weird Trick to fully fix this figure. But I&amp;rsquo;m not sure that there is. For example, at a minimum we could redraw these numbers to reflect the fact that they&amp;rsquo;re not zero-sum. Keep each measure as a row (i.e. on the y-axis) but have the lines, or columns, be side by side within each category instead of facing off. Like this:&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/05/21/zero-sum-problems/gruber-stats1.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/05/21/zero-sum-problems/gruber-stats1.png&#34;
         alt=&#34;Team Stats side by side for each measure.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Team Stats side by side for each measure.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;This view at least lets you immediately see who &amp;ldquo;won&amp;rdquo; each measure. The viewer
can just directly compare the length of the bars in each category. &lt;a href=&#34;https://socviz.co/01-look-at-data.html#visual-tasks-and-decoding-graphs&#34;&gt;People are
really good at doing that
accurately.&lt;/a&gt;
In that sense it&amp;rsquo;s much less confusing than the original. But there&amp;rsquo;s still a
lot wrong with it. The core problem is that when we draw a graph like this,
we&amp;rsquo;re usually putting &lt;em&gt;the same kind of thing&lt;/em&gt; (e.g. countries, or religious
groups, or sports teams) on the y-axis, and then seeing how different their
scores are on some single measure (e.g. GDP, or number of adherents, or average
points scored per game), which we put on the x-axis. Maybe we use color to break
things out by some third measure as well.&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt; In
this case, I&amp;rsquo;ve just labeled the x-axis as generically as possible. &amp;ldquo;Value&amp;rdquo;
covers the range of all the measures. The lowest value is 5, in Largest Lead.
The highest is 88, in Free Throw %. But these numbers are not meaningfully
comparable. The graph encourages us to compare across as well as within
categories. But while within-category comparisons are meaningful, the
between-category ones are not. There were way more Bench Points than Blocks in
the game. But that is not a useful thing to know.&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/05/21/zero-sum-problems/gruber-stats2.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/05/21/zero-sum-problems/gruber-stats2.png&#34;
         alt=&#34;Team Stats side by side and ordered from absolute highest to lowest, whatever that means.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Team Stats side by side and ordered from absolute highest to lowest, whatever that means.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Knowing who won each measure isn&amp;rsquo;t nothing. It can be informative about how the game went, maybe especially when a team won the game but &amp;ldquo;lost&amp;rdquo; on a number of the measures. If you really wanted to lean in to that aspect, you could sort of justify the zero-sum view, and maybe look for a way to sort and order by &amp;ldquo;how much&amp;rdquo; a team &amp;ldquo;won&amp;rdquo; each category. But again, what&amp;rsquo;s the right denominator for those measures? For instance, do we care about a team&amp;rsquo;s share of all Defensive Rebounds in the game? Or do we care about the share of Defensive Rebounds a team won relative to every opportunity it had to make a Defensive Rebound? How meaningful is ordering our rows by those kinds of shares? Even worse, some measures (notably Fouls) are &lt;em&gt;bad&lt;/em&gt; to &amp;ldquo;win&amp;rdquo;, so we&amp;rsquo;d have to do something about those.&lt;/p&gt;
&lt;p&gt;Our fundamental problem is that we just have two cases (the teams) and fifteen
different measures, or variables. Each variable, except for the three
percentages, is in effect on its own scale. There&amp;rsquo;s no direct way to make
comparisons across them. Sure, some of these measures are probably going to be
associated with one another&amp;mdash;e.g. Turnovers and Points Off Turnovers&amp;mdash;but the
numeric values aren&amp;rsquo;t directly comparable in general. If you know a lot about
basketball you might have some informative rules of thumb about each one of
these measures, or some of them in combination. But at that point the lines in
this particular graph are not going to be doing any work for you; you&amp;rsquo;ll just
end up looking directly at the numbers. If we had data on all these measures for
every NBA game for a whole season then we could of course do much more with
them, because then each measure would have a distribution across all games and
across all teams.&lt;/p&gt;
&lt;p&gt;As it is, the purpose of the &amp;ldquo;Stats&amp;rdquo; screen in Apple Sports is just to summarize
information from a single game. The other thing I could think of to do with the
numbers as a kind of graph is something like this:&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/05/21/zero-sum-problems/gruber-stats3.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/05/21/zero-sum-problems/gruber-stats3.png&#34;
         alt=&#34;A back-to-back column chart.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;A back-to-back column chart.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;This is &lt;em&gt;marginally&lt;/em&gt; more helpful than the one before just because, again, it
gets rid of the unhelpful zero-sum look of the original. As I hope you can
immediately see, it creates many other difficulties. It also doesn&amp;rsquo;t do away
with the core problem. That problem is principally one of information design
rather than data visualization. What I mean is that what we&amp;rsquo;re trying to
organize is, in effect, fifteen pairs of related but fundamentally distinct
numbers. If we had fifteen &lt;em&gt;cases&lt;/em&gt; and two &lt;em&gt;variables&lt;/em&gt; things would be simple. But
with fifteen variables and two cases &amp;hellip; well, this is not the kind of thing you
can make a single effective and non-confusing graph out of. That&amp;rsquo;s why I kind of
sympathize with the designer. In a constrained space they have to show thirty
numbers (thirty-two, including the score). Lots of information. A straight table
seems like it would be boring. Surely there&amp;rsquo;s some way to thematically integrate
the numbers in a visually appealing manner that brings out some of the
relationships across the rows. That&amp;rsquo;s what graphs do; it seems like the right
thing to reach for. But at its heart this information is not a graph. It just
sort of looks like one, and that ends up confusing people.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;Modulo some measurement decisions about how to determine when possession is turned over while the ball is in play.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://socviz.co/05-more-on-geoms.html#fig-ch-05-organdata-06&#34;&gt;Here&amp;rsquo;s an
example&lt;/a&gt; of a graph with a categorical measure on the y-axis, a continuous measure on the x-axis, and an additional categorical feature shown with color.&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    
    
    <item>
      <title>Bad Weather and the Subway</title>
      <link>https://kieranhealy.org/blog/archives/2026/05/02/bad-weather-and-the-subway/</link>
      <pubDate>Sat, 02 May 2026 08:59:15 -0400</pubDate>
      
      <guid>https://kieranhealy.org/blog/archives/2026/05/02/bad-weather-and-the-subway/</guid>
      <description>&lt;figure class=&#34;full-width&#34;&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/05/02/bad-weather-and-the-subway/snow-in-nyc.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/05/02/bad-weather-and-the-subway/snow-in-nyc.png&#34;
         alt=&#34;Two figures walking in the snow; trees in the distance.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Snow in Inwood, New York. Photograph by the author.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Recently I&amp;rsquo;ve been looking at hourly ridership data from the New York City Subway. Last time we learned that &lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/04/25/hourly-subway-station-flows/&#34;&gt;people go to work in the morning and come home in the evening&lt;/a&gt;, for example. (All together now: &amp;ldquo;Only in New York, baby!&amp;rdquo;) Today we&amp;rsquo;ll learn that bad weather makes people stay at home. Except, sometimes it doesn&amp;rsquo;t.&lt;/p&gt;
&lt;p&gt;Regular readers will recall that the subway system &lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/02/19/mta-ridership/&#34;&gt;carries a &lt;em&gt;lot&lt;/em&gt; of passengers every day&lt;/a&gt;. The ridership data for the whole of 2025 represents just over 1.3 billion entries into the system via an OMNY tap or Metrocard. It&amp;rsquo;s available aggregated to hourly resolution by station complex. With that data in hand, we can calculate average hourly ridership for every day of the week. This gives us a profile of what, for example, a Monday or a Wednesday typically looks like, by hour. When calculating the average day-of-the-week profile we exclude holidays and the like.&lt;/p&gt;
&lt;p&gt;Meanwhile, the National Weather Service provides data on severe weather events that affected the New York City region in 2025. We could get more fine-grained if we wanted to, but for now we&amp;rsquo;ll just use the &lt;a href=&#34;https://www.weather.gov/okx/stormarchive&#34;&gt;general list of events&lt;/a&gt; the NWS provides. Then, for each event, we plot the Subway ridership profile for its date against the average profile for whatever day of the week the event happened on.&lt;/p&gt;
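&lt;p&gt;A minimal sketch of that baseline-versus-event comparison, on toy numbers. The column names, the holiday list, and the event date are illustrative assumptions; this is not the MTA schema or the code behind the figure.&lt;/p&gt;

```r
library(dplyr)

# Toy hourly entries for three Mondays; column names are assumptions.
rides = tibble(
  date    = as.Date(rep(c("2025-01-06", "2025-01-13", "2025-01-20"), each = 2)),
  hour    = rep(c(8L, 17L), times = 3),
  entries = c(100, 90, 110, 95, 40, 35)
) |>
  mutate(wday = as.POSIXlt(date)$wday)   # 0 = Sunday, 1 = Monday

holidays  = as.Date("2025-01-20")   # hypothetical exclusion list
event_day = as.Date("2025-01-13")   # hypothetical weather-event date

# Average profile for each day of the week, by hour, with holidays dropped.
baseline = rides |>
  filter(!date %in% holidays) |>
  group_by(wday, hour) |>
  summarise(avg_entries = mean(entries), .groups = "drop")

# The event day's own profile, lined up against its day-of-week baseline.
comparison = rides |>
  filter(date == event_day) |>
  left_join(baseline, by = c("wday", "hour")) |>
  mutate(ratio = entries / avg_entries)
```

&lt;p&gt;In this sketch the event day itself still contributes to the baseline. Whether to leave it out is a judgment call that the description above does not settle.&lt;/p&gt;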
&lt;figure class=&#34;full-width&#34;&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/05/02/bad-weather-and-the-subway/rhythms_2025_weather.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/05/02/bad-weather-and-the-subway/rhythms_2025_weather.png&#34;
         alt=&#34;Small multiple showing generally suppressive relationship between subway ridership and adverse weather days in 2025.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Bad weather suppresses Subway ridership, in general. But not always.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;The gray lines are the baseline. The red ones are the bad weather day. The basic shape of the gray lines (and many of the red ones) is set by the rhythm of daily life. The sharp double-peak pattern is what someone I&amp;rsquo;ve shown too many of these graphs to has taken to calling &amp;ldquo;The Giant Cat-Ears of Employment&amp;rdquo;. The cat-ear shapes vary by work day (which might be the topic of another post), but are most sharply contrasted with the weekends, which look more like little hillocks or &lt;a href=&#34;https://en.wikipedia.org/wiki/Drumlin&#34;&gt;drumlins&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We can see a few different cases in the panels. First are days when the weather event put no dent at all in people&amp;rsquo;s day. This is because &lt;del&gt;of the incredible toughness and resilience of New Yorkers, something they are surprisingly very modest about&lt;/del&gt; even though there was a weather event in the region that day, it just didn&amp;rsquo;t impinge on the city much, or at all. The light snow on February 11th or the heavy rain on March 6th are examples here. People just continued to go about their business.&lt;/p&gt;
&lt;p&gt;Second are cases where there&amp;rsquo;s a lot of travel suppression but it&amp;rsquo;s not really&amp;mdash;or not wholly&amp;mdash;the weather that&amp;rsquo;s responsible. The winter storm on Friday December 26th is a case of this. Bad weather; strongly suppressed travel profile; but that&amp;rsquo;s not a regular Friday. Many people were staying at home anyway, because it&amp;rsquo;s the day after Christmas.&lt;/p&gt;
&lt;p&gt;Third are cases where the weather does seem to have suppressed travel. These are days like the snow on January 19th, or the shitty weather on Sunday February 16th. These events look like they made people stay at home. Some of these are more severe than others. The strongest example is the flash flooding on Thursday July 31st. That happened in the back half of the day and affected the evening commute directly.&lt;/p&gt;
&lt;p&gt;Our fourth and final category is my favorite one. Sometimes snow makes no difference at all, especially if it&amp;rsquo;s on a workday. Sometimes it&amp;rsquo;s snowy on the weekend but you&amp;rsquo;re kind of sick of it, maybe because it&amp;rsquo;s late in the winter, so you&amp;rsquo;re either going about your business as usual or you&amp;rsquo;re just staying indoors. But there&amp;rsquo;s another kind of snow day.&lt;/p&gt;
&lt;figure class=&#34;full-width&#34;&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/05/02/bad-weather-and-the-subway/rhythms_2025_weather_storm_dec13.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/05/02/bad-weather-and-the-subway/rhythms_2025_weather_storm_dec13.png&#34;
         alt=&#34;A close up of Dec 13th and 14th, when the first snow of the season fell and it made people want to go outside.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Let&amp;rsquo;s go exploring.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;The weekend of &lt;a href=&#34;https://www.weather.gov/okx/20251213_14&#34;&gt;December 13th and 14th 2025&lt;/a&gt; brought the city&amp;rsquo;s &lt;a href=&#34;https://weather.com/news/news/2025-12-14-first-snow-new-york-city&#34;&gt;first measurable snow of the year&lt;/a&gt;, and in decent amounts, too&amp;mdash;&lt;a href=&#34;https://www.weather.gov/okx/20251213_14&#34;&gt;between four and eight inches of accumulation&lt;/a&gt;. Reports remarked on how long it had been in arriving. The result was that, over the weekend, ridership on the subway went &lt;em&gt;up&lt;/em&gt;. Maybe on the Saturday it was to go out and buy the mandatory bread, milk, and eggs.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;  But maybe it was also just to be out in the snow. The next day, the people who didn&amp;rsquo;t have to go to work slept in as usual. But that day, too, across the afternoon, more people than usual headed outside and took the subway somewhere. I&amp;rsquo;d like to think a bunch of them had a sled under their arm.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;Maybe, at least for some New Yorkers, it was because it made more sense to take the subway than drive. Though this probably wouldn&amp;rsquo;t be all that many people. It&amp;rsquo;d be somewhat possible to investigate this with the data at hand, especially if e.g. outlying stations showed higher ridership rates.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    
    
    <item>
      <title>Hourly Subway Station Flows</title>
      <link>https://kieranhealy.org/blog/archives/2026/04/25/hourly-subway-station-flows/</link>
      <pubDate>Sat, 25 Apr 2026 11:12:39 -0400</pubDate>
      
      <guid>https://kieranhealy.org/blog/archives/2026/04/25/hourly-subway-station-flows/</guid>
      <description>&lt;p&gt;&lt;a href=&#34;https://socviz.co/08-polishing.html#saying-no-to-pie&#34;&gt;Pie charts are bad&lt;/a&gt;, as
any fule kno. We&amp;rsquo;re not as good at judging relative differences between angles
and areas as we are at judging relative differences in lengths on a common
baseline. This is especially true when we have more than two things to compare
at the same time. So, as a rule, you shouldn&amp;rsquo;t use them. You should figure out
some other way of viewing your data instead. On the other hand, I just made 424
animated pie charts because if you&amp;rsquo;re going to break a rule you should break
it good and hard.&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/04/25/hourly-subway-station-flows/subway-map.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/04/25/hourly-subway-station-flows/subway-map.png&#34;
         alt=&#34;A view of the New York City Subway System (excluding the SIR). We&amp;#39;ll animate this in a minute.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;A view of the New York City Subway System (excluding the SIR). We&amp;rsquo;ll animate this in just a minute.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;The New York City Subway system is very large and carries &lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/02/19/mta-ridership/&#34;&gt;a &lt;em&gt;lot&lt;/em&gt; of passengers
every day&lt;/a&gt;. The
&lt;a href=&#34;https://www.mta.info&#34;&gt;MTA&lt;/a&gt; makes quite a bit of data available about the
subway, including data on hourly flow through the system. Now, the MTA can&amp;rsquo;t
track individual pathways people take through the subway. If you use an &lt;a href=&#34;https://omny.info&#34;&gt;OMNY
card&lt;/a&gt; (or before that, a Metrocard) to enter the system, this
signals the start of a trip from some specific station or station complex. But
unlike some systems, you don&amp;rsquo;t need to &amp;ldquo;tag out&amp;rdquo; of the subway, you just exit
through a turnstile. So the system doesn&amp;rsquo;t know where you exit it. In addition,
while many stations are just on a single line, some (like 34 St/Penn Station, or
Fulton Street) are station complexes that serve many lines and allow transfers
between them.&lt;/p&gt;
&lt;p&gt;However, the MTA does publish hourly &lt;a href=&#34;https://data.ny.gov/Transportation/MTA-Subway-Origin-Destination-Ridership-Estimate-2/y2qv-fytt/about_data&#34;&gt;Origin-Destination
estimates&lt;/a&gt;
for all pairs of stations. These are their &lt;a href=&#34;https://data.ny.gov/api/views/y2qv-fytt/files/c912f0c9-7371-44c9-a7a3-95c5389b82fe?download=true&amp;amp;filename=MTA_SubwayOriginDestinationRidershipEstimate_Overview.pdf&#34;&gt;best
guess&lt;/a&gt;
about the flow of traffic from any particular station to any other. Because
there are so many combinations, visualizing that sort of data is quite tricky.
Even then, you don&amp;rsquo;t get information about &lt;em&gt;routes&lt;/em&gt; through the system, just
start and end points. Transit analysts and planners can go further by
introducing additional assumptions about subway users. For example, we might assume that commuters take the most efficient route between any given pair of entry and exit stations, and build from there to a picture of flow through the system.&lt;/p&gt;
&lt;p&gt;I do something rather more simple here. I use the MTA&amp;rsquo;s hourly
origin-destination estimates and aggregate them on a station-by-station basis to
calculate in-and-out flows across 424 subway stations or station
complexes. These specific numbers are averaged over all Mondays in 2025. For
each hour of the day we calculate the total passenger volume at the station, and the
share of that volume that are estimated arrivals and departures. Then we draw a
pie chart for each station, coloring it yellow for departures,
purple for arrivals. The circle size reflects total volume and the pie slice
proportions show the flow balance.&lt;/p&gt;
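&lt;p&gt;The aggregation step might look roughly like the sketch below. The column names follow the data frame printed further down; the values are made up, only a single hour is shown, and counting arrivals in the same hour as the corresponding departure is a simplifying assumption rather than something the text above specifies.&lt;/p&gt;

```r
library(dplyr)

# Toy origin-destination rows; values are invented for illustration.
od = tibble(
  hour_of_day                    = c(8L, 8L, 8L),
  origin_station_complex_id      = c(1L, 1L, 2L),
  destination_station_complex_id = c(2L, 3L, 1L),
  estimated_average_ridership    = c(30, 10, 20)
)

# Riders leaving each station: sum over rows where it is the origin.
departures = od |>
  group_by(hour_of_day, station = origin_station_complex_id) |>
  summarise(departures = sum(estimated_average_ridership), .groups = "drop")

# Riders reaching each station: sum over rows where it is the destination.
arrivals = od |>
  group_by(hour_of_day, station = destination_station_complex_id) |>
  summarise(arrivals = sum(estimated_average_ridership), .groups = "drop")

# One row per station-hour: total volume sizes the circle,
# share_departing sets the pie split.
flows = full_join(departures, arrivals, by = c("hour_of_day", "station")) |>
  mutate(
    departures      = coalesce(departures, 0),
    arrivals        = coalesce(arrivals, 0),
    volume          = departures + arrivals,
    share_departing = departures / volume
  ) |>
  arrange(hour_of_day, station)
```

&lt;p&gt;The full join matters here: a station with arrivals but no departures in a given hour still needs a row, with its missing count filled in as zero.&lt;/p&gt;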
&lt;p&gt;The flow data is pretty bulky. The original dataset has about 121 million rows. But working with it is pretty straightforward, thanks to the magic of parquet files, &lt;a href=&#34;https://duckdb.org&#34;&gt;duckdb&lt;/a&gt;, and &lt;a href=&#34;https://duckplyr.tidyverse.org&#34;&gt;duckplyr&lt;/a&gt;. Having patiently downloaded the data via its API, I put it in a parquet file. The CSV is about 17GB but the parquet file boils it down to 1.5GB. Then I made a small R package that bundled that data with a few convenience functions. This lets me use the data without copying it into any single project. So I can write, e.g.,&lt;/p&gt;
&lt;div class=&#34;highlight-wrapper&#34;&gt;
    
    
        &lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;nycsubwayodr&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;::&lt;/span&gt;&lt;span class=&#34;nf&#34;&gt;nyc_subway_odr&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; # A duckplyr data frame: 15 variables&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;     year month day_of_week hour_of_day timestamp           day_of_month origin_station_complex_id&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;    &amp;lt;int&amp;gt; &amp;lt;int&amp;gt; &amp;lt;chr&amp;gt;             &amp;lt;int&amp;gt; &amp;lt;dttm&amp;gt;                     &amp;lt;int&amp;gt;                     &amp;lt;int&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  1  2025     1 Monday                1 2025-01-06 01:00:00            6                       189&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  2  2025     1 Monday                1 2025-01-06 01:00:00            6                       313&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  3  2025     1 Monday                1 2025-01-06 01:00:00            6                       611&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  4  2025     1 Monday                1 2025-01-06 01:00:00            6                       125&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  5  2025     1 Monday                1 2025-01-06 01:00:00            6                       313&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  6  2025     1 Monday                1 2025-01-06 01:00:00            6                       154&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  7  2025     1 Monday                1 2025-01-06 01:00:00            6                       167&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  8  2025     1 Monday                1 2025-01-06 01:00:00            6                       612&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  9  2025     1 Monday                1 2025-01-06 01:00:00            6                       272&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; 10  2025     1 Monday                1 2025-01-06 01:00:00            6                       167&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; # ℹ more rows&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; # ℹ 8 more variables: origin_station_complex_name &amp;lt;chr&amp;gt;, origin_latitude &amp;lt;dbl&amp;gt;, origin_longitude &amp;lt;dbl&amp;gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; #   destination_station_complex_id &amp;lt;int&amp;gt;, destination_station_complex_name &amp;lt;chr&amp;gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; #   destination_latitude &amp;lt;dbl&amp;gt;, destination_longitude &amp;lt;dbl&amp;gt;, estimated_average_ridership &amp;lt;dbl&amp;gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
    
&lt;/div&gt;

&lt;p&gt;From there, we lazily query the data and duckdb does the calculations. The whole table is never loaded into your R session, and duckdb is very fast. Next, we take our hourly flow summaries, join them to a tibble of station and line data, and export the result to some JSON files that &lt;a href=&#34;https://d3js.org&#34;&gt;D3.js&lt;/a&gt; animates for us.&lt;/p&gt;
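&lt;p&gt;As a rough illustration of that lazy-query step, here is a minimal sketch that uses a made-up three-row table in an in-memory duckdb database. The column names are modeled on the origin-destination table (with &lt;code&gt;hour_of_day&lt;/code&gt; and &lt;code&gt;origin_station_complex_id&lt;/code&gt; assumed), and the toy values, the &lt;code&gt;station_id&lt;/code&gt; and &lt;code&gt;net_flow&lt;/code&gt; names, the sign convention, and the output filename are all assumptions for the example rather than the code behind the animation:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(DBI)
library(dplyr) # the dbplyr and duckdb packages must also be installed

# Toy stand-in for the origin-destination table (values are made up)
od &amp;lt;- tibble::tribble(
  ~hour_of_day, ~origin_station_complex_id, ~destination_station_complex_id, ~estimated_average_ridership,
  8, 167, 612, 120,
  8, 612, 167,  30,
  8, 272, 612,  50
)

con &amp;lt;- dbConnect(duckdb::duckdb())
dbWriteTable(con, &amp;#34;od&amp;#34;, od)

# A lazy reference: nothing is pulled into R until collect() is called
trips &amp;lt;- tbl(con, &amp;#34;od&amp;#34;)

departures &amp;lt;- trips |&amp;gt;
  group_by(hour_of_day, origin_station_complex_id) |&amp;gt;
  summarize(departures = sum(estimated_average_ridership, na.rm = TRUE), .groups = &amp;#34;drop&amp;#34;) |&amp;gt;
  rename(station_id = origin_station_complex_id)

arrivals &amp;lt;- trips |&amp;gt;
  group_by(hour_of_day, destination_station_complex_id) |&amp;gt;
  summarize(arrivals = sum(estimated_average_ridership, na.rm = TRUE), .groups = &amp;#34;drop&amp;#34;) |&amp;gt;
  rename(station_id = destination_station_complex_id)

# duckdb runs the join and the arithmetic; only the small summary comes back
flows &amp;lt;- full_join(departures, arrivals, by = c(&amp;#34;hour_of_day&amp;#34;, &amp;#34;station_id&amp;#34;)) |&amp;gt;
  mutate(
    departures = coalesce(departures, 0),
    arrivals = coalesce(arrivals, 0),
    net_flow = departures - arrivals # assumed convention: positive means net departures
  ) |&amp;gt;
  arrange(hour_of_day, station_id) |&amp;gt;
  collect()

# Hypothetical output file for the front end to read
jsonlite::write_json(flows, file.path(tempdir(), &amp;#34;hourly_flows.json&amp;#34;))

dbDisconnect(con, shutdown = TRUE)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With the real table the pattern would be the same, only with the station and line tibble joined on before the export.&lt;/p&gt;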
&lt;p&gt;Here&amp;rsquo;s the result. There are three views. Initially, you see just the schematic subway map. If you click the &amp;ldquo;Map&amp;rdquo; button in the top left, it will switch to the ticking pie-chart view, which puts a pie on every station complex, with each tick being an hour of the day. The pies pile up on one another in the geographic view (in a not wholly uninformative way), but click again to have them expand to a somewhat more abstracted, force-directed network view of the system. Then click again to go back to the map. You can hover over or tap on a node to get information about the data it&amp;rsquo;s currently showing.&lt;/p&gt;
&lt;link rel=&#34;stylesheet&#34; href=&#34;subway-transition.css&#34;&gt;
&lt;div id=&#34;odr-controls&#34; style=&#34;display:flex;gap:8px;align-items:center;margin-bottom:8px;font-family:&#39;Helvetica Neue&#39;,Helvetica,sans-serif&#34;&gt;
  &lt;button id=&#34;mode-btn&#34; title=&#34;Cycle: Map → Net Flow → Network&#34;
    style=&#34;background:var(--bs-btn-bg,#f8f9fa);border:1px solid var(--bs-btn-border-color,#dee2e6);border-radius:4px;padding:4px 10px;cursor:pointer;font-size:12px&#34;&gt;Map&lt;/button&gt;
  &lt;button id=&#34;theme-btn&#34; title=&#34;Toggle light/dark theme&#34;
    style=&#34;background:var(--bs-btn-bg,#f8f9fa);border:1px solid var(--bs-btn-border-color,#dee2e6);border-radius:4px;padding:4px 8px;cursor:pointer;font-size:14px&#34;&gt;
    &lt;svg width=&#34;14&#34; height=&#34;14&#34; viewBox=&#34;0 0 384 512&#34; fill=&#34;currentColor&#34; style=&#34;vertical-align:-2px&#34;&gt;&lt;path d=&#34;M223.5 32C100 32 0 132.3 0 256S100 480 223.5 480c60.6 0 115.5-24.2 155.8-63.4c5-4.9 6.3-12.5 3.1-18.7s-10.1-9.7-17-8.5c-9.8 1.7-19.8 2.6-30.1 2.6c-96.9 0-175.5-78.8-175.5-176c0-65.8 36-123.1 89.3-153.3c6.1-3.5 9.2-10.5 7.7-17.3s-7.3-11.9-14.3-12.5c-6.3-.5-12.6-.8-19-.8z&#34;/&gt;&lt;/svg&gt; /
    &lt;svg width=&#34;14&#34; height=&#34;14&#34; viewBox=&#34;0 0 512 512&#34; fill=&#34;currentColor&#34; style=&#34;vertical-align:-2px&#34;&gt;&lt;path d=&#34;M361.5 1.2c5 2.1 8.6 6.6 9.6 11.9L391 121l107.9 19.8c5.3 1 9.8 4.6 11.9 9.6s1.5 10.7-1.6 15.2L446.9 256l62.3 90.3c3.1 4.5 3.7 10.2 1.6 15.2s-6.6 8.6-11.9 9.6L391 391 371.1 498.9c-1 5.3-4.6 9.8-9.6 11.9s-10.7 1.5-15.2-1.6L256 446.9l-90.3 62.3c-4.5 3.1-10.2 3.7-15.2 1.6s-8.6-6.6-9.6-11.9L121 391 13.1 371.1c-5.3-1-9.8-4.6-11.9-9.6s-1.5-10.7 1.6-15.2L65.1 256 2.8 165.7c-3.1-4.5-3.7-10.2-1.6-15.2s6.6-8.6 11.9-9.6L121 121 140.9 13.1c1-5.3 4.6-9.8 9.6-11.9s10.7-1.5 15.2 1.6L256 65.1 346.3 2.8c4.5-3.1 10.2-3.7 15.2-1.6zM256 160a96 96 0 1 0 0 192 96 96 0 1 0 0-192z&#34;/&gt;&lt;/svg&gt;
  &lt;/button&gt;
  &lt;div id=&#34;odr-hour-slider-wrap&#34; style=&#34;display:none;flex-direction:column;gap:2px&#34;&gt;
    &lt;div style=&#34;display:flex;align-items:center;gap:4px&#34;&gt;
      &lt;button id=&#34;play-btn&#34; title=&#34;Play/pause&#34; style=&#34;background:#f0f0f0;border:1px solid #999;border-radius:4px;padding:5px 8px;cursor:pointer;line-height:1;display:flex;align-items:center&#34;&gt;
        &lt;svg id=&#34;play-icon&#34; width=&#34;12&#34; height=&#34;12&#34; viewBox=&#34;0 0 320 512&#34; fill=&#34;#333&#34;&gt;&lt;path d=&#34;M48 64C21.5 64 0 85.5 0 112V400c0 26.5 21.5 48 48 48H80c26.5 0 48-21.5 48-48V112c0-26.5-21.5-48-48-48H48zm192 0c-26.5 0-48 21.5-48 48V400c0 26.5 21.5 48 48 48h32c26.5 0 48-21.5 48-48V112c0-26.5-21.5-48-48-48H240z&#34;/&gt;&lt;/svg&gt;
      &lt;/button&gt;
      &lt;input id=&#34;hour-slider&#34; type=&#34;range&#34; min=&#34;0&#34; max=&#34;23&#34; step=&#34;1&#34; value=&#34;8&#34; style=&#34;width:340px&#34;&gt;
    &lt;/div&gt;
    &lt;div id=&#34;tick-labels&#34; style=&#34;display:flex;justify-content:space-between;font-size:10px;color:#666;margin-left:24px;width:340px&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;
  &lt;div id=&#34;odr-legend&#34; style=&#34;display:none;align-items:center;gap:6px;font-size:11px;color:#666;margin-left:8px&#34;&gt;
    &lt;svg width=&#34;24&#34; height=&#34;24&#34; viewBox=&#34;-12 -12 24 24&#34;&gt;
      &lt;path id=&#34;legend-dep&#34; d=&#34;M0,0 L0,-10 A10,10 0 1,1 -5.88,8.09 Z&#34; fill=&#34;#fde725&#34; stroke=&#34;#000&#34; stroke-width=&#34;0.5&#34; opacity=&#34;0.95&#34;/&gt;
      &lt;path id=&#34;legend-arr&#34; d=&#34;M0,0 L-5.88,8.09 A10,10 0 0,1 0,-10 Z&#34; fill=&#34;#440154&#34; stroke=&#34;#000&#34; stroke-width=&#34;0.5&#34; opacity=&#34;0.95&#34;/&gt;
    &lt;/svg&gt;
    &lt;div style=&#34;display:flex;flex-direction:column;line-height:1.2&#34;&gt;
      &lt;span id=&#34;legend-dep-label&#34; style=&#34;color:#c8b900&#34;&gt;Departures&lt;/span&gt;
      &lt;span id=&#34;legend-arr-label&#34; style=&#34;color:#440154&#34;&gt;Arrivals&lt;/span&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;odr-chart&#34;&gt;&lt;/div&gt;
&lt;script src=&#34;https://d3js.org/d3.v7.min.js&#34;&gt;&lt;/script&gt;
&lt;script src=&#34;subway-network-odr.js&#34;&gt;&lt;/script&gt;
&lt;script&gt;
(async function() {
  const [networkData, boroughs] = await Promise.all([
    fetch(&#34;network_odr_monday.json&#34;).then(r =&gt; r.json()),
    fetch(&#34;boroughs.geojson&#34;).then(r =&gt; r.json())
  ]);

  const chart = createSubwayNetworkODR(d3, networkData, boroughs, {
    width: Math.min(window.innerWidth - 40, 1800),
    height: Math.min(900, Math.max(500, window.innerHeight - 100))
  });

  document.getElementById(&#34;odr-chart&#34;).appendChild(chart.node);
  chart.setThemePageBody(false);

  const states = [&#34;geo&#34;, &#34;volume&#34;, &#34;network&#34;];
  const labels = [&#34;Map&#34;, &#34;Net Flow&#34;, &#34;Network&#34;];
  let stateIdx = 0;
  let theme = &#34;light&#34;;
  let animating = true;
  let animInterval = null;

  const modeBtn = document.getElementById(&#34;mode-btn&#34;);
  modeBtn.addEventListener(&#34;click&#34;, () =&gt; {
    stateIdx = (stateIdx + 1) % states.length;
    modeBtn.textContent = labels[stateIdx];
    chart.update(states[stateIdx]);
  });

  const themeBtn = document.getElementById(&#34;theme-btn&#34;);
  themeBtn.addEventListener(&#34;click&#34;, () =&gt; {
    theme = theme === &#34;light&#34; ? &#34;dark&#34; : &#34;light&#34;;
    chart.setTheme(theme);
  });

  const slider = document.getElementById(&#34;hour-slider&#34;);
  slider.addEventListener(&#34;input&#34;, () =&gt; chart.updateHour(+slider.value));

  const tickLabels = [&#34;12am&#34;,&#34;&#34;,&#34;&#34;,&#34;&#34;,&#34;4am&#34;,&#34;&#34;,&#34;&#34;,&#34;&#34;,&#34;8am&#34;,&#34;&#34;,&#34;&#34;,&#34;&#34;,&#34;12pm&#34;,&#34;&#34;,&#34;&#34;,&#34;&#34;,&#34;4pm&#34;,&#34;&#34;,&#34;&#34;,&#34;&#34;,&#34;8pm&#34;,&#34;&#34;,&#34;&#34;,&#34;11pm&#34;];
  const tickContainer = document.getElementById(&#34;tick-labels&#34;);
  tickLabels.forEach(t =&gt; {
    const span = document.createElement(&#34;span&#34;);
    span.textContent = t;
    span.style.width = &#34;0&#34;;
    span.style.textAlign = &#34;center&#34;;
    span.style.overflow = &#34;visible&#34;;
    span.style.whiteSpace = &#34;nowrap&#34;;
    tickContainer.appendChild(span);
  });

  const playBtn = document.getElementById(&#34;play-btn&#34;);
  const playIconSvg = &#39;&lt;svg width=&#34;12&#34; height=&#34;12&#34; viewBox=&#34;0 0 384 512&#34; fill=&#34;#333&#34;&gt;&lt;path d=&#34;M73 39c-14.8-9.1-33.4-9.4-48.5-.9S0 62.6 0 80V432c0 17.4 9.4 33.4 24.5 41.9s33.7 8.1 48.5-.9L361 297c14.3-8.7 23-24.2 23-41s-8.7-32.2-23-41L73 39z&#34;/&gt;&lt;/svg&gt;&#39;;
  const pauseIconSvg = &#39;&lt;svg width=&#34;12&#34; height=&#34;12&#34; viewBox=&#34;0 0 320 512&#34; fill=&#34;#333&#34;&gt;&lt;path d=&#34;M48 64C21.5 64 0 85.5 0 112V400c0 26.5 21.5 48 48 48H80c26.5 0 48-21.5 48-48V112c0-26.5-21.5-48-48-48H48zm192 0c-26.5 0-48 21.5-48 48V400c0 26.5 21.5 48 48 48h32c26.5 0 48-21.5 48-48V112c0-26.5-21.5-48-48-48H240z&#34;/&gt;&lt;/svg&gt;&#39;;

  function startAnimation() {
    animating = true;
    playBtn.innerHTML = pauseIconSvg;
    animInterval = setInterval(() =&gt; {
      const next = (+slider.value + 1) % 24;
      slider.value = next;
      chart.updateHour(next);
    }, 1000);
  }

  function stopAnimation() {
    animating = false;
    playBtn.innerHTML = playIconSvg;
    if (animInterval) { clearInterval(animInterval); animInterval = null; }
  }

  playBtn.addEventListener(&#34;click&#34;, () =&gt; {
    if (animating) stopAnimation(); else startAnimation();
  });

  chart.setSliderElement(document.getElementById(&#34;odr-hour-slider-wrap&#34;));
  chart.updateHour(8);
  startAnimation();
})();
&lt;/script&gt;
&lt;p&gt;Now, you might reasonably say, Kieran, that&amp;rsquo;s a lot of data to show that people go to work in the morning and come home in the evening. I&amp;rsquo;m not saying there&amp;rsquo;s nothing to that criticism. But there are quite a few interesting details in there as the data pick up traffic to different parts of town. The big interchanges naturally dominate the view, but even here there are things of interest about the balance of flow. For example, Penn Station has people coming in on New Jersey Transit during the morning rush hour and then entering the subway, which does a lot to balance its net flow during rush hour and even tip it towards net departures. But more importantly, who doesn&amp;rsquo;t want to sit back and contemplate more than 400 pie charts, each one pulsing with life as another hour ticks by?&lt;/p&gt;
</description>
    </item>
    
    
    
    <item>
      <title>New York City Hexmaps</title>
      <link>https://kieranhealy.org/blog/archives/2026/04/19/new-york-city-hexmaps/</link>
      <pubDate>Sun, 19 Apr 2026 10:05:54 -0400</pubDate>
      
      <guid>https://kieranhealy.org/blog/archives/2026/04/19/new-york-city-hexmaps/</guid>
      <description>&lt;p&gt;The five boroughs of New York City can be informally or formally carved up into many different pieces, depending on what it is that you&amp;rsquo;re doing. As part of an ongoing project, I recently made an R package, &lt;a href=&#34;https://kjhealy.github.io/nycmaps/&#34;&gt;&lt;code&gt;nycmaps&lt;/code&gt;&lt;/a&gt;, that lets you draw maps of some of these geographies. Things being what they are, these spatial units don&amp;rsquo;t necessarily overlap in compatible ways. City, State, and Congressional  Districts, School Districts, Police Precincts, Fire Companies, Election Precincts, Municipal Court Districts, Zip Codes &amp;hellip; there are loads of them. Some of them are quite straightforward; others patiently lie in wait to trap unwary analysts (I&amp;rsquo;m looking at you, Zip Codes / &lt;a href=&#34;https://www.census.gov/programs-surveys/geography/guidance/geo-areas/zctas.html&#34;&gt;ZCTAs&lt;/a&gt;).&lt;/p&gt;
&lt;h3 id=&#34;mapping-tracts-and-ntas&#34;&gt;Mapping Tracts and NTAs&lt;/h3&gt;
&lt;p&gt;Two classifications of particular interest to people like me are Census Tracts and Neighborhood Tabulation Areas (NTAs). Census Tracts are defined by the Census Bureau and form part of a nested set of geographical units that go from the smallest unit the Census keeps track of (the Block) up to the largest (the whole country). Blocks aggregate to Block groups, Block groups aggregate to Tracts, and Tracts aggregate to Counties. There are of course &lt;a href=&#34;https://help.socialexplorer.com/hc/en-us/articles/24930135416733-Census-Geographies&#34;&gt;several complications&lt;/a&gt;. Ideally, &lt;a href=&#34;https://www.census.gov/programs-surveys/geography/about/glossary.html#par_textimage_13&#34;&gt;the Census would like&lt;/a&gt; tracts to be contiguous, sub-county geographical areas with about 4,000 people in them, or at least between 1,200 and 8,000 people. This means tracts can vary considerably in geographical area. They generally follow visible features of the environment, whether physical or built. Uninhabited areas also get tract designations, so in principle we can get a full tract-level map of an area with no gaps. Here&amp;rsquo;s what a tract-level map of New York City looks like:&lt;/p&gt;
&lt;div class=&#34;highlight-wrapper&#34;&gt;
    
    
        &lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;5
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;6
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;nf&#34;&gt;ggplot&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;nycmaps&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;::&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;nyc_census_tracts_2020_sf&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;geom_sf&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;nf&#34;&gt;aes&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;fill&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;boro_name&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;color&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;black&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;linewidth&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;0.1&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;scale_fill_brewer&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;palette&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Set2&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;labs&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;fill&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Borough&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;theme_void&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
    
&lt;/div&gt;

&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/04/19/new-york-city-hexmaps/nychex-ct-geo-bare.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/04/19/new-york-city-hexmaps/nychex-ct-geo-bare.png&#34;
         alt=&#34;A tract-level map of NYC with the boroughs colored in.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;2020 NYC Census Tract boundaries.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;You can see the variation in tract size (compare e.g. Staten Island tracts with those in lower Manhattan). And you can also see features that are included on the map but, at least to a first approximation, don&amp;rsquo;t have permanent residents. That big roundy blob in the southeast corner of Queens, for instance, is JFK Airport. There are about 2,300 Census Tracts in New York City. (Naturally, their number and spatial layout change from decennial census to decennial census, because why should life be easy?)&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://www.nyc.gov/content/planning/pages/resources/datasets/neighborhood-tabulation&#34;&gt;Neighborhood Tabulation Areas&lt;/a&gt;, meanwhile, are not official Census units. They are one of several subdivisions used by New York City government. The idea is to aggregate tracts into units that roughly correspond to neighborhoods that people conventionally refer to. This is, of course, an impossible task, because people don&amp;rsquo;t agree on neighborhood boundaries. But the idea is good. You want something bigger than a tract because those are small enough to be noisy on many measures produced by the main source of tract-level data, the &lt;a href=&#34;https://www.census.gov/programs-surveys/acs.html&#34;&gt;American Community Survey&lt;/a&gt;. But you want something smaller than the next level up, which is a Community District Tabulation Area. Presently, there are 262 NTAs. They look like this:&lt;/p&gt;
&lt;div class=&#34;highlight-wrapper&#34;&gt;
    
    
        &lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;5
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;nf&#34;&gt;ggplot&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;nycmaps&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;::&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;nyc_nta20_sf&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;geom_sf&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;nf&#34;&gt;aes&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;fill&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;boro_name&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;color&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;black&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;linewidth&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;0.1&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;scale_fill_brewer&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;palette&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Set2&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;labs&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;fill&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Borough&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;theme_void&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;()&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
    
&lt;/div&gt;

&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/04/19/new-york-city-hexmaps/nychex-nta-geo-bare.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/04/19/new-york-city-hexmaps/nychex-nta-geo-bare.png&#34;
         alt=&#34;NTA boundaries&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;2020 Neighborhood Tabulation Areas&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;NTAs have recognizable names:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;nycmaps&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;::&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;nyc_nta20_sf&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;select&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;nta2020&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;nta_name&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;nta_abbrev&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;pre&gt;&lt;code&gt;Simple feature collection with 262 features and 3 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 913175.1 ymin: 120128.4 xmax: 1067383 ymax: 272844.3
Projected CRS: NAD83 / New York Long Island (ftUS)
First 10 features:
   nta2020                            nta_name nta_abbrev
1   BK0101                          Greenpoint      Grnpt
2   BK0102                        Williamsburg   Wllmsbrg
3   BK0103                  South Williamsburg  SWllmsbrg
4   BK0104                   East Williamsburg  EWllmsbrg
5   BK0201                    Brooklyn Heights      BkHts
6   BK0202 Downtown Brooklyn-DUMBO-Boerum Hill   DwntwnBk
7   BK0203                         Fort Greene      FtGrn
8   BK0204                        Clinton Hill    ClntnHl
9   BK0261                  Brooklyn Navy Yard   BkNvyYrd
10  BK0301           Bedford-Stuyvesant (West)    BdSty_W
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;When we have a table like this, we can get tract-level data from the Census, for example on educational attainment, and aggregate it to the NTA level. Then we can join that data to the &lt;a href=&#34;https://r-spatial.github.io/sf/&#34;&gt;simple feature collection&lt;/a&gt; that has our geometries in it. With a little polishing (which you can &lt;a href=&#34;https://socviz.co&#34;&gt;read all about in what I personally think of as a very useful book&lt;/a&gt;), we get something like this:&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/04/19/new-york-city-hexmaps/nychex-nta-geo-ba.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/04/19/new-york-city-hexmaps/nychex-nta-geo-ba.png&#34;
         alt=&#34;BA degrees or higher within NTAs, ACS 5-year estimates.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;BA degrees or higher within NTAs.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Nominally zero-population NTAs get grayed out (JFK, LGA, various parks and cemeteries, the Brooklyn Navy Yard, the United Nations, etc.).&lt;/p&gt;
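&lt;p&gt;The tract-to-NTA aggregation behind a map like this can be sketched in a few lines of dplyr. The counts below are made up, and every column name other than &lt;code&gt;nta2020&lt;/code&gt; and &lt;code&gt;nta_name&lt;/code&gt; is hypothetical. The one point the sketch is meant to show is that counts get summed within each NTA before the percentage is taken, rather than averaging tract-level percentages:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(dplyr)

# Made-up tract-level counts with an NTA code attached to each tract
tracts &amp;lt;- tibble::tribble(
  ~tract_id, ~nta2020, ~n_ba, ~n_pop25,
  &amp;#34;t1&amp;#34;, &amp;#34;BK0101&amp;#34;, 1200, 2000,
  &amp;#34;t2&amp;#34;, &amp;#34;BK0101&amp;#34;,  900, 2500,
  &amp;#34;t3&amp;#34;, &amp;#34;BK0102&amp;#34;,  400, 1000
)

# Sum the numerators and denominators first, then compute the share
nta_ba &amp;lt;- tracts |&amp;gt;
  group_by(nta2020) |&amp;gt;
  summarize(
    n_ba = sum(n_ba),
    n_pop25 = sum(n_pop25),
    pct_ba = 100 * n_ba / n_pop25
  )

# Toy lookup standing in for the table with the geometries in it.
# With the real data the left-hand side would be nycmaps::nyc_nta20_sf
# (assuming the package is installed), joined on the same nta2020 key.
nta_lookup &amp;lt;- tibble::tribble(
  ~nta2020, ~nta_name,
  &amp;#34;BK0101&amp;#34;, &amp;#34;Greenpoint&amp;#34;,
  &amp;#34;BK0102&amp;#34;, &amp;#34;Williamsburg&amp;#34;
)

left_join(nta_lookup, nta_ba, by = &amp;#34;nta2020&amp;#34;)
&lt;/code&gt;&lt;/pre&gt;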
&lt;p&gt;Here&amp;rsquo;s what this data looks like at the tract level:&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/04/19/new-york-city-hexmaps/nychex-ct-geo-ba.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/04/19/new-york-city-hexmaps/nychex-ct-geo-ba.png&#34;
         alt=&#34;BA degrees or higher within tracts, ACS 5-year estimates.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;BA degrees or higher within tracts.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;h3 id=&#34;hexgrids&#34;&gt;Hexgrids&lt;/h3&gt;
&lt;p&gt;Sometimes we want a more schematic representation of geographies, because &lt;a href=&#34;https://socviz.co/07-maps.html#americas-ur-choropleths&#34;&gt;choropleth maps can be tricky to work with&lt;/a&gt;. There are &lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/11/06/mamdani-vs-sliwa-and-cuomo/&#34;&gt;many&lt;/a&gt; &lt;a href=&#34;https://kieranhealy.org/blog/archives/2023/12/06/dorling-cartograms/&#34;&gt;possibilities&lt;/a&gt; here, including not drawing maps at all. One option is to make a kind of cartogram by turning our map polygons into a tessellated grid where each unit gets a single tile. My &lt;a href=&#34;https://kjhealy.github.io/nychex/&#34;&gt;&lt;code&gt;nychex&lt;/code&gt; package&lt;/a&gt; provides hexagonal and square tilings for NTAs and tracts in New York City. Turning geographically accurate polygons into regular tiled grids can be a bit tricky, especially when the polygons you are trying to tile contain &amp;ldquo;holes&amp;rdquo;. But thanks to the &lt;a href=&#34;https://kaerosen.github.io/tilemaps/&#34;&gt;&lt;code&gt;tilemaps&lt;/code&gt;&lt;/a&gt; and &lt;a href=&#34;http://andyteucher.ca/rmapshaper/&#34;&gt;&lt;code&gt;rmapshaper&lt;/code&gt;&lt;/a&gt; packages we can get reasonably far in a semi-automated way, and then tweak things by manually nudging tiles around. Our baseline NTA hexmap looks like this:&lt;/p&gt;
&lt;div class=&#34;highlight-wrapper&#34;&gt;
    
    
        &lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;5
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;nf&#34;&gt;ggplot&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;nychex&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;::&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;nyc_nta20_hex_sf&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;geom_sf&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;nf&#34;&gt;aes&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;fill&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;boro_name&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;color&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;black&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;linewidth&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;0.3&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;scale_fill_brewer&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;palette&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Set2&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;labs&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;fill&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Borough&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;theme_void&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;()&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
    
&lt;/div&gt;

&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/04/19/new-york-city-hexmaps/nychex-nta-hex-bare.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/04/19/new-york-city-hexmaps/nychex-nta-hex-bare.png&#34;
         alt=&#34;Bare NTA hexmap&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;The opening gameboard for my upcoming strategy game, &lt;em&gt;Ticket to Ridgewood&lt;/em&gt;&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;And here is what our BA prevalence data looks like when mapped with it:&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/04/19/new-york-city-hexmaps/nychex-nta-hex-ba.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/04/19/new-york-city-hexmaps/nychex-nta-hex-ba.png&#34;
         alt=&#34;BA prevalence, NTA hexmap edition.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Labeled NTA hexmap&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Here I&amp;rsquo;ve labeled the hexes with the (sometimes highly) abbreviated version of their NTA name. Maps like this don&amp;rsquo;t magically solve the difficulties of spatially representing population-based data, but they have their uses. They can be handy if you want to quickly get a sense of variation across units while retaining a roughly spatial layout. They also make some kinds of small-multiple or faceted plots a little easier.&lt;/p&gt;
&lt;p&gt;We don&amp;rsquo;t have to stop at NTA-level resolution. We can do a tract-level one, too. Here&amp;rsquo;s a tract-level base hexmap:&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/04/19/new-york-city-hexmaps/nychex-ct-hex-bare.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/04/19/new-york-city-hexmaps/nychex-ct-hex-bare.png&#34;
         alt=&#34;Bare tract-level hexmap&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Ticket to Ridgewood, advanced edition&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;This map makes a series of compromises with city geography in order to make room for each tract&amp;rsquo;s hexagon. In particular, northern Manhattan is more detached from the Bronx than is ideal, and it was also necessary to sever Brooklyn from Queens along their shared border. (Our simplify-and-tile algorithm had a very hard time with the undifferentiated Brooklyn/Queens landmass.) With a fully hand-drawn hexmap we could probably avoid most of these problems, but that would involve quite a substantial amount of work. Even when we generate the main components algorithmically, quite a bit of hand-adjustment in the overall positioning and layout is still required (especially for things like the Rockaways and other islands or quasi-islands). Sadly there&amp;rsquo;s no magic way to integrate the main borough polygons while preserving the orientation of all the hexes. Still, the result isn&amp;rsquo;t bad.&lt;/p&gt;
&lt;p&gt;Here&amp;rsquo;s our BA map in tract-level hexagonal form:&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/04/19/new-york-city-hexmaps/nychex-ct-hex-ba.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/04/19/new-york-city-hexmaps/nychex-ct-hex-ba.png&#34;
         alt=&#34;Tract-level BA hexmap&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Tract-level BA map.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;The usual benefits and disadvantages of regularized choropleths are in evidence here. Lower Manhattan&amp;rsquo;s population gets a fairer shout, as do parts of Brooklyn and the Bronx. The geography still mostly works. The difficulties flow mostly from the map being at the tract level in the first place, rather than from the tiling. That is, some tracts have unusual shapes that result in quite noisy estimates. For example, sometimes a tract will consist &lt;em&gt;mostly&lt;/em&gt; of a park or an industrial area, but have a few residential segments. This can mean it ends up being measured too high or too low on the thing we&amp;rsquo;re counting. Or, by contrast, a tract might be almost entirely one kind of entity, like a retirement home, producing results that might seem odd if you don&amp;rsquo;t know what&amp;rsquo;s in that spot. You can see why the City aims at the NTA level for a lot of its summaries. It has ten times fewer units, but things get smoothed out in a way that may be more useful. Any real-world method of measurement comes with some rate of error, which the Census helpfully provides estimates of. Nice maps tempt you to reify observations and spin yarns about what you see, whether it&amp;rsquo;s a finely-detailed spatial polygon or a pleasingly regular hexagon. The finer the observational grain, the more important it is for you to know about the situation on the ground. Literally.&lt;/p&gt;
</description>
    </item>
    
    
    
    <item>
      <title>Subway Sign</title>
      <link>https://kieranhealy.org/blog/archives/2026/03/28/subway-sign/</link>
      <pubDate>Sat, 28 Mar 2026 14:25:52 -0400</pubDate>
      
      <guid>https://kieranhealy.org/blog/archives/2026/03/28/subway-sign/</guid>
      <description>&lt;p&gt;After the &lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/10/13/parking-signs/&#34;&gt;parking signs&lt;/a&gt; last time, here is a subway sign.&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/03/28/subway-sign/no-king-queens-subway.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/03/28/subway-sign/no-king-queens-subway.png&#34;
         alt=&#34;Subway sign with Queens-bound trains, headed No Kings&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;A borough-specific sign to display&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Here&amp;rsquo;s a direct link to the &lt;a href=&#34;no-king-queens-subway.png&#34;&gt;PNG&lt;/a&gt; and the &lt;a href=&#34;no-king-queens-subway.pdf&#34;&gt;PDF&lt;/a&gt;. Once again, put it onna stick and exercise the constitutional rights to freedom of expression, speech, and assembly enjoyed by &lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/06/28/american/&#34;&gt;everyone in the United States&lt;/a&gt;.&lt;/p&gt;
</description>
    </item>
    
    
    
    <item>
      <title>Confessors in Harmony</title>
      <link>https://kieranhealy.org/blog/archives/2026/03/23/confessors-in-harmony/</link>
      <pubDate>Mon, 23 Mar 2026 10:54:57 -0400</pubDate>
      
      <guid>https://kieranhealy.org/blog/archives/2026/03/23/confessors-in-harmony/</guid>
      <description>&lt;p&gt;&lt;a href=&#34;https://en.wikipedia.org/wiki/Charles_Fourier&#34;&gt;Charles Fourier&lt;/a&gt; was an early socialist utopian, part of the French tradition of thinkers who came up with various schemes for the complete reorganization of society on more rational grounds, and whose views now read to us as an unstable admixture of obviously sensible notions, delusional crankery, and things that seem to encapsulate both of those elements at once in a way that brings out the weirder aspects of our own dominant forms of social organization. Fourierist communities, known as &lt;a href=&#34;https://en.wikipedia.org/wiki/Phalanst%C3%A8re&#34;&gt;phalansteries&lt;/a&gt;, got off the ground in several countries, including a few in the United States. The basic social unit he had in mind was called a &lt;em&gt;phalanx&lt;/em&gt;; it would consist of 1,620 people, because of course there are 810 types of personality and you need two of each kind. A phalanstery is thus the large complex that houses a phalanx. &lt;a href=&#34;https://en.wikipedia.org/wiki/Horace_Greeley&#34;&gt;Horace Greeley&lt;/a&gt; founded a couple, including the Sylvania Colony in Pennsylvania that eventually became the town of &lt;a href=&#34;https://en.wikipedia.org/wiki/Greeley,_Pennsylvania&#34;&gt;Greeley, PA&lt;/a&gt;.&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/03/23/confessors-in-harmony/phalanstere.jpg&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/03/23/confessors-in-harmony/phalanstere.jpg&#34;
         alt=&#34;A diagram of an ideal phalanstery.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;A phalanstery.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Reading Fourier in his context is of course a matter of careful intellectual
history, which we are going to skip here. I just wanted to make a note of a
passage I came across this morning where Fourier discusses aspects of his plans
for relations between the sexes. Fourier had many interesting views here that
flow from his rejection of the constraints of &amp;ldquo;civilization&amp;rdquo;.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; He coined the
term &amp;ldquo;feminism&amp;rdquo;. His vision was of a kind of liberation of human creative
energies and capacities of all kinds, from work to sex, that would be organized
in a manner at once fulfilling to individuals and socially productive and
harmonious. As is often the case in utopian visions, there&amp;rsquo;s a desire to have
all of the benefits of an advanced division of labor while avoiding its overall
problems and rejecting the social mechanisms we seem to have to coordinate it
(which is to say, usually, an oppressive state or an open market). French
utopian socialism, in particular, often ends up with schemes that are at once
clearly hierarchical, or at least occupationally differentiated, highly
integrated and solidaristic, but also fully accommodating of individual freedoms
and liberties. A magic wand that grants some theory of human nature, or posits a
general conversion to some sort of religion of science, is often waved by the
author to accomplish the hard parts.&lt;/p&gt;
&lt;p&gt;A side effect of this line of thought is that the twin beliefs in rational
administration of the social order and the harmonious satisfaction of individual
needs and desires can result in ideas that seem strangely familiar to us. For
example, in this passage Fourier is discussing the problem of how people should
pair up in phalansteries, and in particular how a traveler from one settlement
to another should be able to make the best use of their time in matters of love.
His answer is that we clearly need some sort of organized matching system that
optimizes on the combination of people&amp;rsquo;s present sexual wants and their basic
personality types:&lt;/p&gt;



&lt;blockquote&gt;
    &lt;p&gt;We will now take up one of the most interesting branches of the calculus of
the passions: the art of enabling anyone anywhere, even in places where he is
a total stranger, to make instant contact with people with whom he is in
complete sympathy. If the theory of attraction offered no other advantage,
would it not still be a boon to all mankind? Would it not be a blessing to the
people of civilization who often spend years in a city without encountering
sympathetic partners in love, friendship or any of the passions? In Harmony
any traveler will make such acquaintances on the very day of his arrival in a
city. &amp;hellip; The art of the &lt;em&gt;sympathist&lt;/em&gt;, which is unknown in civilization,
provides the means for the instant matching of personalities and sympathies,
anywhere and under any circumstances.&lt;/p&gt;
&lt;p&gt;Let us first consider the enormity of the calculations entailed by sympathetic
matching and the speed with which they must be performed. &amp;hellip; Let us suppose
that a horde of one thousand adventurers and adventuresses has arrived in a
Phalanx at four o&amp;rsquo;clock in the afternoon. They are immediately served light
refreshments and then, even before they take time to wash, they rush off to
confession. The most skillful confessors and confessoresses of the region have
been gathered. Their job is to examine these thousand knights errant, each of
whom must submit a written declaration concerning his or her most recent
adventures. The confessors go over these declarations as well as the reports
of previous confessors; they study the immediate inclinations and physical
needs of each individual, and attempt to provide him or her with appropriate
sympathetic relationships. Putting all the relevant information together, the
confessors determine, by means of an equation, the balance of contrasts and
identities that will be most attractive to each of the adventurers.&lt;/p&gt;
&lt;p&gt;When an individual is still basking in the enthusiasm of a romantic passion,
the delicious contrast provided by a sympathy in the composite mode, should
the confessor intervene to provide a diversion? Yes, very likely. For someone
who has just concluded a sympathetic relationship in the composite would
probably be incapable of duplicating the experience immediately. Then should
the confessor provide the visitor with a cabalistic liaison by offering him a
choice among several candidates of identical character? Or should an appeal be
made to the visitor&amp;rsquo;s alternating sympathies, his penchant for variety and
contrast in both the moral and physical realms? Such are the first questions
considered by the confessor. For there are three basic types of sympathies,
and the initial problem is to decide which should be employed. A variation may
be in order. Or it may be advisable to continue on the same scale of
sympathies. For if an individual&amp;rsquo;s appetite has not been exhausted by his
previous adventures, he will be capable of engaging the same sympathies in a
higher or lower degree.&lt;/p&gt;
&lt;p&gt;That is the heart of the matter; now let us turn to subordinate problems. No
matter which of the three sympathies is brought into play, should it be
presented directly or should it be preceded by transitions or complements? In
the latter case, should the individual be subjected to a direct unitary
sympathy or to an inverse unitary sympathy or perhaps even a diffracted
sympathy? Should simple movement be relied upon?&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt; This is sometimes a wise
course of action, though rather inglorious, as simple movement always is; but
it may be necessary in exceptional cases which will be determined by the
confessors and the fairies. It is not for the individual to choose between
these alternatives. He is too much absorbed by his recent memories. It is up
to the calm and judicious confessor to determine the type of charm that will
arouse his enthusiasm. On the basis of his declarations the confessor will
decide what sort of relationship is most certain to engage his sympathies.
Then the fairies with the best knowledge of the Phalanx will designate the
matching individuals from whom he may make his choice.&lt;/p&gt;
&lt;p&gt;All of this work, which involves the use of algebraic formulas, should be
completed for the thousand adventurers within the short time of two or three
hours. For if each visitor was not systematically informed concerning his
sympathies, if there was no one to inform him about the individuals with whom
he could instantaneously establish sympathetic relationships, he would run the
risk of getting involved in purely sensual intrigues, intrigues wholly lacking
in illusion. Like the people of civilization, he would fall back upon trivial,
simple sex. (Sometimes this is necessary, but only as a respite from
composite relationships or, as a transition, in moments of hesitation and of
overabundant pleasure.) After a two days&amp;rsquo; visit, at the moment of his
departure, he might accidentally encounter people with whom he was truly
sympathetic. Then he would regret having spent his two days without having
known their merit and without having formed a liaison which might have charmed
him. He would leave the Phalanx with a feeling of resentment, with bad
memories instead of delicious illusions.&lt;/p&gt;
&lt;p&gt;This goes to show that the functions of the confessor are most important and
that a skilled confessor is an invaluable member of any Phalanx. The job is
not one that can be confided to the first comer, as it is in civilization; it
requires the greatest degree of tact, human understanding, and familiarity
with local circumstances. Women will excel more than men in this sort of work,
and as a rule in Harmony there will be two confessoresses for every one
confessor. They will be magnificently paid for their services and promoted to
the highest ranks. As for the formulas used by the confessors in their work of
sympathetic matching, they cannot be explained just yet. It will first be
necessary to classify the 810 personality types, any one of which may be
represented in an equation of sympathy and may appear in a number of cases.&lt;/p&gt;

&lt;/blockquote&gt;

&lt;p&gt;This translation is from &lt;em&gt;The Utopian Vision of Charles Fourier&lt;/em&gt;, edited by
Jonathan Beecher and Richard Bienvenu (Beacon, 1971, pp. 378&amp;ndash;80). They note, somewhat dryly,
that &amp;ldquo;Fourier does in fact go on to explain&amp;mdash;at length&amp;mdash;the process whereby
the confessors do their matching. Unfortunately, his &amp;lsquo;algebraic&amp;rsquo; formulas do not
lend themselves to translation.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;Here&amp;rsquo;s a taste of that, from volume eleven of &lt;em&gt;Oeuvres complètes de Charles Fourier&lt;/em&gt;:&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/03/23/confessors-in-harmony/fourier-formula.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/03/23/confessors-in-harmony/fourier-formula.png&#34;
         alt=&#34;Fourier&amp;#39;s matching system.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;I wouldn&amp;rsquo;t rely on it myself.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;In any case, I like the idea of a full filing system containing an information
card for each person, classified according to a system of 810 personality types,
and including regularly updated information about needs and recent activities,
curated and administered by experts, and which people consult upon arriving in a
new town or city when they are considering how best to match up with someone it
would be enjoyable to hook up with. Obviously a completely unrealistic and
impossible-to-implement scheme that bears no resemblance whatsoever to how
&lt;a href=&#34;https://theordinalsociety.com&#34;&gt;anything in contemporary society works&lt;/a&gt;.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;Always a derogatory term for Fourier, in contrast with &amp;ldquo;Harmony&amp;rdquo;, the overall name for his utopia.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;Here he means what is in his view the simplest kind of attraction, that of opposites.&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    
    
    <item>
      <title>Using Quarto to Write a Book</title>
      <link>https://kieranhealy.org/blog/archives/2026/03/09/using-quarto-to-write-a-book/</link>
      <pubDate>Mon, 09 Mar 2026 09:34:53 -0400</pubDate>
      
      <guid>https://kieranhealy.org/blog/archives/2026/03/09/using-quarto-to-write-a-book/</guid>
      <description>&lt;p&gt;I&amp;rsquo;ve spent the last couple of months revising my &lt;a href=&#34;https://press.princeton.edu/books/hardcover/9780691181615/data-visualization&#34;&gt;Data Visualization book&lt;/a&gt; for a second edition that, ideally, will appear some time in the next twelve months. As with the first edition, I&amp;rsquo;ve posted a &lt;a href=&#34;https://socviz.co&#34;&gt;complete draft of the book&lt;/a&gt; at its website. The production process hasn&amp;rsquo;t started yet, so it&amp;rsquo;s not ready to pre-order or anything, but the site has a one-question &lt;a href=&#34;https://forms.gle/4xeALwJLbzdzT8rz7&#34;&gt;form you can fill out&lt;/a&gt; that asks for your email address if you&amp;rsquo;d like to be notified with one (and only one) email when it&amp;rsquo;s available. A lot has changed since the first edition, reflecting changes both in R and ggplot specifically, and in the world of coding generally. I may end up highlighting some of those new elements in other posts. But here, I want to focus on some nerdy details involved in getting the book to its final draft. I&amp;rsquo;ll discuss &lt;a href=&#34;https://quarto.org&#34;&gt;Quarto&lt;/a&gt;, the publishing system I used, its many advantages, and its current limits with respect to the demands I made of it.&lt;/p&gt;
&lt;figure class=&#34;full-width&#34;&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/03/09/using-quarto-to-write-a-book/dv2-distributions-page-detail.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/03/09/using-quarto-to-write-a-book/dv2-distributions-page-detail.png&#34;
         alt=&#34;Page detail from the draft book.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;A detail from facing pages in Chapter 4, in the PDF version.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;The book is about doing good data visualization using &lt;a href=&#34;https://www.r-project.org&#34;&gt;R&lt;/a&gt; and &lt;a href=&#34;https://ggplot2.tidyverse.org&#34;&gt;ggplot&lt;/a&gt;. It contains many figures, almost all of which are produced by the code the book shows and explains.&lt;/p&gt;
&lt;h3 id=&#34;reasonable-demands&#34;&gt;Reasonable Demands&lt;/h3&gt;
&lt;p&gt;My baseline list of requirements for the book manuscript was as follows:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The entire text of the book is in some kind of plain-text format.&lt;/li&gt;
&lt;li&gt;Figures in the book that are the result of R code should be directly produced by R code in the actual document; no cutting and pasting of code snippets and separately-produced figures. Doing that is a recipe for error.&lt;/li&gt;
&lt;li&gt;The scholarly machinery of the book&amp;mdash;chapter, section, table, and figure numbering; cross-references; in-text bibliographical references; the bibliography itself and its formatting; and so on&amp;mdash;should be automatically handled. No manual numbering and renumbering of figures, etc.&lt;/li&gt;
&lt;li&gt;It should be straightforward to repeatedly generate a fully-formatted and laid-out version of the book manuscript as I go, ideally in any of several output formats (e.g. PDF, HTML), despite it all being written in plain text.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These requirements are reasonable because, for projects like this, working in &lt;a href=&#34;https://plain-text.co&#34;&gt;plain text is the right thing to do&lt;/a&gt;. We are writing and revising text and our code; we keep the text in a version control system; we don&amp;rsquo;t want the results of the code to come apart from the code that generated it; and we need to deliver outputs that consist both of fully-formatted material and replication packages that allow other people to see what we did. PDF is of course &lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/02/06/kerning-and-kerning-in-a-widening-gyre/&#34;&gt;the worst&lt;/a&gt;, but we still need to target it as one of our output formats.&lt;/p&gt;
&lt;p&gt;Despite being reasonable, these requirements are in truth quite demanding. Once you start thinking about what all the pieces entail you realize there&amp;rsquo;s a &lt;em&gt;lot&lt;/em&gt; to keep track of. Systems for doing some or all of this have been developed in whole or in part over the years. Newer ones sometimes escape the constraints of older ones; sometimes they inherit their legacies. I&amp;rsquo;m not going to review them here. This time around I used &lt;a href=&#34;https://quarto.org&#34;&gt;Quarto&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Quarto is a publishing system focused on documents of different kinds (articles,
presentations, books, websites), written as plain-text sources that mix prose
and code in any of several languages (R, Python, Julia, others), destined to be
fully-finished outputs in any of several formats (PDFs, HTML, or Word files).
Quarto builds on and extends many tools, notably &lt;a href=&#34;https://pandoc.org&#34;&gt;pandoc&lt;/a&gt;
for getting from Markdown to any number of other output formats. It&amp;rsquo;s a
spiritual descendant of &lt;a href=&#34;https://en.wikipedia.org/wiki/Literate_programming&#34;&gt;literate
programming&lt;/a&gt; approaches for
dealing with code that needs to be run in the context of prose. In the R world
these descendants include
&lt;a href=&#34;https://cran.r-project.org/doc/manuals/r-patched/packages/utils/vignettes/Sweave.pdf&#34;&gt;Sweave&lt;/a&gt;
and &lt;a href=&#34;https://yihui.org/knitr/&#34;&gt;RMarkdown/knitr&lt;/a&gt;. These broadly &amp;ldquo;notebook&amp;rdquo;
approaches to writing and discussing code have benefits and also sharp
limits if your focus is full-on software development and its documentation, or
complex data analysis involving many interrelated steps.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; But they&amp;rsquo;re &lt;em&gt;very&lt;/em&gt; useful
if you are primarily writing longer-form text that periodically requires things
like figures and tables to be programmatically generated in a reproducible
fashion.&lt;/p&gt;
&lt;p&gt;If you just want to know whether you can write long-form projects like articles, books, or websites using Quarto and R, the answer is absolutely yes. A long time ago I wrote parts of my dissertation and several articles using Sweave. A few years ago I wrote the first edition of &lt;em&gt;Data Visualization&lt;/em&gt; using RMarkdown. I wrote the second edition using Quarto. Each one was better than the previous version in terms of flexibility and power. Quarto eliminated several pain-points that I had to deal with for the first edition of this book. It&amp;rsquo;s very &lt;a href=&#34;https://quarto.org/docs/guide/&#34;&gt;well-documented&lt;/a&gt; and continually improving. Its defaults are sensible and produce &lt;a href=&#34;https://quarto.org/docs/gallery/&#34;&gt;good-looking output&lt;/a&gt;. You can stop reading now.&lt;/p&gt;
&lt;figure class=&#34;full-width&#34;&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/03/09/using-quarto-to-write-a-book/workflow-wide-quarto.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/03/09/using-quarto-to-write-a-book/workflow-wide-quarto.png&#34;
         alt=&#34;A schematic overview of how Quarto orchestrates its document processing.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;A schematic overview of how Quarto orchestrates its document processing.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;h3 id=&#34;unreasonable-demands&#34;&gt;Unreasonable Demands&lt;/h3&gt;
&lt;p&gt;I had a very clear idea about how I wanted the first edition of the book to look
in print. I also knew that I wanted to make it available as a website. I was
fortunate enough to be able to have both of these things work out. This time
around, I did the same again, but I really wanted there to be as little
&lt;em&gt;post hoc&lt;/em&gt; work as possible on the website version. I knew that wouldn&amp;rsquo;t be the
case with the PDF, for reasons I will discuss in a moment. I&amp;rsquo;m pleased that
Quarto performed so well with the whole process. I wrote two pretty
heavily-customized output formats (one for PDF and one for HTML) that specified
the layout of the book. Quarto&amp;rsquo;s LaTeX-based book pipeline uses the &lt;a href=&#34;https://ctan.org/pkg/scrbook?lang=en&#34;&gt;&lt;code&gt;scrbook&lt;/code&gt; class&lt;/a&gt; from the &lt;a href=&#34;https://ctan.org/pkg/koma-script?lang=en&#34;&gt;KOMA-Script&lt;/a&gt; bundle, which has many nice features, though I find its documentation a tiny bit eccentric. (This might be because I wrote my first book using &lt;a href=&#34;https://ctan.org/pkg/memoir?lang=en&#34;&gt;the &lt;code&gt;memoir&lt;/code&gt; class&lt;/a&gt;.) I also wrote a couple of R packages that
managed the themes and some other details of how PNG and &lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/02/06/kerning-and-kerning-in-a-widening-gyre/&#34;&gt;especially PDF&lt;/a&gt; figures
were produced. A version of the theme is in the development version of the
&lt;a href=&#34;https://kjhealy.github.io/socviz/&#34;&gt;&lt;code&gt;socviz&lt;/code&gt; package&lt;/a&gt; that accompanies the
book.&lt;/p&gt;
&lt;p&gt;The PDF design is a two-column &amp;ldquo;Tufte-style&amp;rdquo; layout with wide margins for side-notes and figures. It works very well for a book of this kind as we can show small figures alongside the code that generates them, but also have figures break out of the main text column if needed.&lt;/p&gt;
&lt;figure class=&#34;full-width&#34;&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/03/09/using-quarto-to-write-a-book/dv2-halloween-page.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/03/09/using-quarto-to-write-a-book/dv2-halloween-page.png&#34;
         alt=&#34;Facing pages with a figure that runs the full width of one of the pages.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Facing pages with a figure that runs the full width of one of the pages.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;A layout like this can&amp;rsquo;t be rigidly ported over to a website, especially in an era of widely-varying screen sizes. So the HTML version of the book has a broadly responsive layout that arranges things differently at different sizes. Organizing and tweaking it this time around was made a lot easier by Quarto&amp;rsquo;s much better support for margin notes and marginal figures. It certainly wasn&amp;rsquo;t without its headaches. Marginal figures and notes are quite annoying to deal with in both HTML and PDF formats, for different reasons. In the PDF case, it&amp;rsquo;s tricky to get captions right, and there are still a few hacks in there to make it work. But it&amp;rsquo;s &lt;em&gt;much&lt;/em&gt; cleaner than what I had to do in RMarkdown for the first edition, which was in effect a lot of regular expression substitution for things I could only add after the &lt;code&gt;.tex&lt;/code&gt; file was produced. That&amp;rsquo;s gone now.&lt;/p&gt;
&lt;p&gt;Here&amp;rsquo;s a screenshot of a facing page layout with some code, some marginal notes, and two kinds of figures, one in the margin and one full page-width:&lt;/p&gt;
&lt;figure class=&#34;full-width&#34;&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/03/09/using-quarto-to-write-a-book/dv2-gdppercap-page.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/03/09/using-quarto-to-write-a-book/dv2-gdppercap-page.png&#34;
         alt=&#34;Gapminder figures in the PDF version.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Gapminder figures in the PDF version.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;And here&amp;rsquo;s some of the same material as seen on the website:&lt;/p&gt;
&lt;figure class=&#34;full-width&#34;&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/03/09/using-quarto-to-write-a-book/dv2-gdppercap-web.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/03/09/using-quarto-to-write-a-book/dv2-gdppercap-web.png&#34;
         alt=&#34;Gapminder figures in the HTML version&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Gapminder figures in the HTML version&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;&lt;a href=&#34;https://socviz.co/04-group-facet-transform.html#facet-to-make-small-multiples&#34;&gt;Here&amp;rsquo;s a direct link to the same section.&lt;/a&gt; In the website version the marginal figures appear more marginal. There&amp;rsquo;s also a little bit of conflict to be worked out between the navigation guides and the marginal notes. In addition, the intrinsic variability of the web layout means that the positioning of the marginal notes is less precisely controllable than it is in the PDF output. But the overall result is pretty good. And I have to say it&amp;rsquo;s very satisfying to be able to produce a good website and a clean PDF (and also an ePub!) from the same folder of &lt;code&gt;qmd&lt;/code&gt; files, with the text written in &lt;a href=&#34;https://daringfireball.net/projects/markdown/&#34;&gt;Markdown&lt;/a&gt;, the bibliography managed by &lt;a href=&#34;https://www.zotero.org&#34;&gt;Zotero&lt;/a&gt; and &lt;a href=&#34;https://retorque.re/zotero-better-bibtex/&#34;&gt;BBT&lt;/a&gt;, interspersed with the code that makes all the figures.&lt;/p&gt;
&lt;h3 id=&#34;let-the-professionals-do-a-professional-job&#34;&gt;Let the Professionals do a Professional Job&lt;/h3&gt;
&lt;p&gt;I should say &amp;ldquo;less precisely controllable &lt;em&gt;without substantial further adjustment&lt;/em&gt;&amp;rdquo;. Because this is the crux of the customization biscuit. There&amp;rsquo;s no end to it. One of the benefits of being in a position to do a second edition&amp;mdash;something I really am very grateful for&amp;mdash;is that it allowed me to have a much better sense of the production process for the hard copy of the book. This in turn placed sharp limits on what I was willing to do when it came to customizing the PDF version myself. Camera-ready files for books published by proper Presses are produced in many different ways. My &lt;a href=&#34;https://theordinalsociety.com&#34;&gt;most recent book&lt;/a&gt;, which is all prose and no code, was designed and typeset using &lt;a href=&#34;https://en.wikipedia.org/wiki/Adobe_InDesign&#34;&gt;Adobe InDesign&lt;/a&gt;. For the first edition of &lt;em&gt;Data Visualization&lt;/em&gt; I sent the Press a set of LaTeX files and PDF image assets. The LaTeX files produced a very good facsimile of the design we&amp;rsquo;d agreed on. Then the Press&amp;rsquo;s typesetter got to work on it.&lt;/p&gt;
&lt;p&gt;You might think that they just took my files, lightly edited them here and there, and added the trim, bleed, registration, and color marks for the physical print job.  That&amp;rsquo;s not how it went. Book layouts are very hard to get just right, especially layouts that have many different-sized images and notes and other paraphernalia. They&amp;rsquo;re fragile. Moving something slightly here, or editing a sentence there, can cause a cascade of unwanted effects. Even ordinary pages of text will have issues with excessive or insufficient spacing around paragraph and section breaks, or &lt;a href=&#34;https://en.wikipedia.org/wiki/Widows_and_orphans&#34;&gt;widows and orphans&lt;/a&gt;, or &lt;a href=&#34;https://en.wikipedia.org/wiki/River_(typography)&#34;&gt;rivers&lt;/a&gt;, and many other infelicities that most people won&amp;rsquo;t notice explicitly, but which cumulatively convey bad vibes even to people who don&amp;rsquo;t much care about design.&lt;/p&gt;
&lt;p&gt;Some of this can be automated. That&amp;rsquo;s what layout algorithms do. The &lt;a href=&#34;https://en.wikipedia.org/wiki/Knuth%E2%80%93Plass_line-breaking_algorithm&#34;&gt;Knuth-Plass box-and-glue algorithm&lt;/a&gt;, which is the thing that causes TeX to emit those &lt;code&gt;Underfull \hbox (badness 10000)&lt;/code&gt; complaints, is a real marvel. But it can&amp;rsquo;t quite work miracles. In my case, the professional typesetter took my LaTeX file, threw away my document class and substituted their own custom one (and some custom style files). Like any document class it defined the layout and all the features of the book, but it also included a variety of commands that allowed her to finely adjust the text as needed on any particular page. Tightening up the spacing here; forcing a break there; very slightly expanding or contracting the page size when needed to make sure that the layout didn&amp;rsquo;t break in a visible way on the next page, and so on. Here&amp;rsquo;s an example from the first edition:&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/03/09/using-quarto-to-write-a-book/dv-tweaks-1.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/03/09/using-quarto-to-write-a-book/dv-tweaks-1.png&#34;/&gt;&lt;/a&gt;
&lt;/figure&gt;
&lt;p&gt;And another:&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/03/09/using-quarto-to-write-a-book/dv-tweaks-2.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/03/09/using-quarto-to-write-a-book/dv-tweaks-2.png&#34;/&gt;&lt;/a&gt;
&lt;/figure&gt;
&lt;p&gt;Those uses of &lt;code&gt;{\break}&lt;/code&gt;, &lt;code&gt;\enlargethispage&lt;/code&gt;, &lt;code&gt;\vspace{}&lt;/code&gt;, and the non-breaking space in &lt;code&gt;this~way&lt;/code&gt; are all done by hand, based on rendering and re-rendering the document as it&amp;rsquo;s built to make sure each page meets the Press&amp;rsquo;s standards. An automatically-produced PDF can get you eighty-five or ninety percent of the way to this but, if you really want to get things right, that last stretch will inevitably mean a bunch of adjustment by hand in whatever the final format is. That&amp;rsquo;s not something you can incorporate into your reproducibility pipeline.&lt;/p&gt;
&lt;p&gt;Fortunately, you don&amp;rsquo;t need to. Most of the time we don&amp;rsquo;t require anything like that level of attention to detail. It&amp;rsquo;s worth producing and circulating material in accessible and readable formats that also don&amp;rsquo;t look like garbage. And it&amp;rsquo;s gratifying to be able to reliably generate pretty high-quality versions of those outputs from plain-text sources. That&amp;rsquo;s more than good enough in almost all cases. When writing papers that end up as PDFs, for example, I use a template that&amp;rsquo;s almost 20 years old. I only touch it when something breaks. By the same token, while my amateur interests compel me to run up polished custom Quarto book formats for my book projects, I also know that the people who set type for a living know a lot more about the fine grain of that work than I know, or need to know. But once in a while it&amp;rsquo;s nice to see how far you can push things.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;The trick is to have the code chunks in your document be short and sweet, and have structured scripts and properly-documented packages manage the heavy lifting in any analysis.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    
    
    <item>
      <title>Data Visualization, Second Edition</title>
      <link>https://kieranhealy.org/blog/archives/2026/03/06/data-visualization-second-edition/</link>
      <pubDate>Fri, 06 Mar 2026 06:52:36 -0500</pubDate>
      
      <guid>https://kieranhealy.org/blog/archives/2026/03/06/data-visualization-second-edition/</guid>
      <description>&lt;p&gt;I&amp;rsquo;ve written a second edition of &lt;a href=&#34;https://socviz.co&#34;&gt;&lt;em&gt;Data Visualization: A Practical Introduction&lt;/em&gt;&lt;/a&gt;, which ideally should come out with Princeton University Press later this year. As with the first edition, a full draft of the book is available at &lt;a href=&#34;https://socviz.co&#34;&gt;https://socviz.co&lt;/a&gt;. The production process is just getting started so there&amp;rsquo;s no new cover yet, and there isn&amp;rsquo;t a link to pre-order. But (also like last time) I&amp;rsquo;ve put up a link to a &lt;a href=&#34;https://forms.gle/4xeALwJLbzdzT8rz7&#34;&gt;form&lt;/a&gt; that lets you add your email if you&amp;rsquo;d like to be notified when it&amp;rsquo;s available to buy. You&amp;rsquo;ll only get one email (from me personally, not a marketing department) if you do; no spam or anything.&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2026/03/06/data-visualization-second-edition/global_mean_simple.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2026/03/06/data-visualization-second-edition/global_mean_simple.png&#34;
         alt=&#34;Global Mean Sea Surface Temperatures&#34;/&gt;&lt;/a&gt;
&lt;/figure&gt;
&lt;p&gt;The revised edition is a pretty thorough rewrite. Naturally all the code is brought up to date for ggplot 4 and R version 4.5 and higher. The code from the first edition still runs, but you&amp;rsquo;ll get warnings and so on; those are all now gone. The back half of the book has been pretty thoroughly redone to reflect big changes in the availability of software for maps (the &lt;code&gt;sf&lt;/code&gt; package) and extracting results from models (the &lt;code&gt;marginaleffects&lt;/code&gt; package). Meanwhile, several years of teaching this material (and getting feedback from others) have resulted in shifts of emphasis here and there to introduce just a little bit more on data wrangling. As the book goes on I also shift from an &amp;ldquo;object-based&amp;rdquo; approach to writing plots to a more &amp;ldquo;pipeline-based&amp;rdquo; one.&lt;/p&gt;
&lt;p&gt;The recent rise of LLMs and coding agents gets some discussion, too. There the question is &amp;ldquo;Why can&amp;rsquo;t I just have a robot write all the code for me?&amp;rdquo; I don&amp;rsquo;t dismiss this question out of hand, and I don&amp;rsquo;t pretend that agents aren&amp;rsquo;t very powerful. My feeling about this is summed up in the &lt;a href=&#34;https://socviz.co/#whats-new-in-this-edition&#34;&gt;Preface&lt;/a&gt;:&lt;/p&gt;



&lt;blockquote&gt;
    &lt;p&gt;Perhaps you have a robot to help you write your code now. Large Language Models (LLMs) and coding agents are now part of the workflow of code generation and evaluation. They can do a great deal; so much so that it might seem superfluous to spend any time with the iterative, write-try-redo approach to visualization that this book presents. Can’t the robot write all the code instead? Not quite. It’s not that I believe repeatedly doing repetitive and error-prone tasks yourself is a virtue. To the contrary, that’s what computers are for. This book is full of examples where we end up automating something in order not to worry about it. But I also want you, the reader, to learn how to do good graphical work in a reproducible way. That means having a keen eye for quality and a good nose for error. Cultivating those senses requires practice and a vocabulary to express them. It seems faintly absurd to have to say explicitly but, whatever tools you use, your work will be better if you know what you are doing and understand why you are doing it. This book teaches you ggplot specifically, but it is not trying to lock you in to a particular framework. It’s just that, the way you acquire a general skill or a wide-ranging taste is by first learning some more specific version of those things, and then practicing them. Automation can come later. In the words of the author Ann Leckie, you don’t learn how to do something by not doing it. For that reason, this book remains a hands-on introduction.&lt;/p&gt;

&lt;/blockquote&gt;

&lt;p&gt;Or to put it another way, the book is an introduction to how to do something. One feature of books like it is that they tend to have two audiences: people who don&amp;rsquo;t know anything about the topic, and who&amp;rsquo;d like to learn something about it, and people who know a &lt;em&gt;lot&lt;/em&gt;, at least in relative terms, and who have forgotten what it&amp;rsquo;s like not to know it.  When the first edition came out, one of the early Amazon reviews was a complaint that the book seemed &amp;ldquo;pretty introductory&amp;rdquo; in its content. I mean, my Brother in Christ, that is right there in the title.&lt;/p&gt;
&lt;p&gt;As with any corner of the vast division of labor that is human society, not everyone has to know about any specific thing in great detail. We&amp;rsquo;re all taking huge amounts of stuff for granted at any moment. But if you want to be proficient in some piece of that enormous web, it&amp;rsquo;s better that you know rather than not know what&amp;rsquo;s what. There&amp;rsquo;s nothing wrong with using tools that give you tremendous leverage. You do it every time you use a stand mixer in the kitchen, or a sander in the garage. You do it every time you turn your computer on, in fact. But you still need to develop the capacity to tell good work from bad, or correct from incorrect output, or safe uses from dangerous ones. That way you can take advantage of the power tools without being at risk of slicing your own or anyone else&amp;rsquo;s arm off.&lt;/p&gt;
</description>
    </item>
    
    
    
    <item>
      <title>Ordinal Exchanges</title>
      <link>https://kieranhealy.org/blog/archives/2025/11/13/ordinal-exchanges/</link>
      <pubDate>Thu, 13 Nov 2025 12:52:40 -0500</pubDate>
      
      <guid>https://kieranhealy.org/blog/archives/2025/11/13/ordinal-exchanges/</guid>
      <description>&lt;figure class=&#34;full-width&#34;&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/11/13/ordinal-exchanges/tos-detail.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2025/11/13/ordinal-exchanges/tos-detail.png&#34;/&gt;&lt;/a&gt;
&lt;/figure&gt;
&lt;p&gt;The &lt;em&gt;Socio-Economic Review&lt;/em&gt; has published a &lt;a href=&#34;https://academic.oup.com/ser/advance-article-abstract/doi/10.1093/ser/mwaf057/8317764&#34;&gt;book symposium&lt;/a&gt; on &lt;a href=&#34;https://theordinalsociety.com&#34;&gt;&lt;em&gt;The Ordinal Society&lt;/em&gt;&lt;/a&gt; with contributions from Nitsan Chorev, J.P. Pardo-Guerra, and Greta Krippner, followed by a reply from Marion and myself. (Here&amp;rsquo;s a &lt;a href=&#34;https://kieranhealy.org/files/papers/tos-keeping-score.pdf&#34;&gt;PDF of the exchange&lt;/a&gt;.) There&amp;rsquo;s also a symposium at the &lt;em&gt;Journal of Cultural Economy&lt;/em&gt; with contributions from &lt;a href=&#34;https://www.tandfonline.com/doi/full/10.1080/17530350.2025.2525986&#34;&gt;Hatim A. Rahman&lt;/a&gt;, &lt;a href=&#34;https://www.tandfonline.com/doi/full/10.1080/17530350.2025.2523768&#34;&gt;Juan M. del Nido&lt;/a&gt;, &lt;a href=&#34;https://www.tandfonline.com/doi/full/10.1080/17530350.2025.2523771&#34;&gt;Julien Migozzi&lt;/a&gt;, and &lt;a href=&#34;https://www.tandfonline.com/doi/full/10.1080/17530350.2025.2531852&#34;&gt;Michelle Jackson&lt;/a&gt;, again with a &lt;a href=&#34;https://www.tandfonline.com/doi/full/10.1080/17530350.2025.2523767&#34;&gt;reply from us&lt;/a&gt;. Finally, there&amp;rsquo;s also a new review from &lt;a href=&#34;https://journals.sagepub.com/doi/abs/10.1177/00018392251394664&#34;&gt;Michael Sauder in &lt;em&gt;ASQ&lt;/em&gt;&lt;/a&gt; (&lt;a href=&#34;sauder-ordinal-society.pdf&#34;&gt;PDF&lt;/a&gt;), which follows on reviews from &lt;a href=&#34;https://academic.oup.com/sf/article-abstract/103/4/e18/7964715&#34;&gt;Barbara Kiviat in &lt;em&gt;Social Forces&lt;/em&gt;&lt;/a&gt; (&lt;a href=&#34;kiviat-ordinal-society.pdf&#34;&gt;PDF&lt;/a&gt;), and Laura Nelson in &lt;a href=&#34;https://www.lauraknelson.com/publication/book-review-ordinal-society/ordinal-society-review.pdf&#34;&gt;&lt;em&gt;Acta Sociologica&lt;/em&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;As we say more than once in our replies, it&amp;rsquo;s very gratifying to have your peers in the field take the time to read your work.&lt;/p&gt;
</description>
    </item>
    
    
    
    <item>
      <title>Trustworthy Data Visualization</title>
      <link>https://kieranhealy.org/blog/archives/2025/11/10/trustworthy-data-visualization/</link>
      <pubDate>Mon, 10 Nov 2025 08:26:45 -0500</pubDate>
      
      <guid>https://kieranhealy.org/blog/archives/2025/11/10/trustworthy-data-visualization/</guid>
      <description>&lt;div style=&#34;position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;&#34;&gt;
      &lt;iframe allow=&#34;accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen&#34; loading=&#34;eager&#34; referrerpolicy=&#34;strict-origin-when-cross-origin&#34; src=&#34;https://www.youtube-nocookie.com/embed/ZamPCbvBAgE?autoplay=0&amp;amp;controls=1&amp;amp;end=0&amp;amp;loop=0&amp;amp;mute=0&amp;amp;start=0&#34; style=&#34;position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;&#34; title=&#34;YouTube video&#34;&gt;&lt;/iframe&gt;
    &lt;/div&gt;

&lt;p&gt;This past September I gave the closing keynote at &lt;a href=&#34;https://posit.co/conference/&#34;&gt;posit::conf&lt;/a&gt;; it&amp;rsquo;s now on YouTube to watch. Keen-eyed observers will note from the title that it&amp;rsquo;s about trustworthy data visualization. But it&amp;rsquo;s also about trust a bit more generally, and how we should think about it in a world where  researchers are faking results, AIs are enthusiastically confabulating, and government is destroying data infrastructure. When you find yourself giving a talk with a little tiny microphone stuck to the side of your head you have to ask yourself some hard questions, but the talk was partly about that.&lt;/p&gt;
&lt;p&gt;One of the nice things about the opportunity to give a talk like this to a large audience is that you can tell the audience not just about your own stuff, but also the work you build from and rely on. This is, as you&amp;rsquo;ll see if you watch, one of the themes of the talk in the first place. In my case, in addition to all the good things done by people in the general area of data visualization and data science, I was able to talk about some ideas from social theory and, in addition, some of the work of &lt;a href=&#34;https://en.wikipedia.org/wiki/Katherine_Hawley&#34;&gt;Katherine Hawley&lt;/a&gt; on trust and commitment. The fulcrum of the talk is essentially an idea about commitment that she developed in her book &lt;a href=&#34;https://academic.oup.com/book/32233&#34;&gt;&lt;em&gt;How to be Trustworthy&lt;/em&gt;&lt;/a&gt;. I got to know Katherine and her family a little bit over a few summers in the 2010s when my own family used to visit St Andrews, where she was Professor of Philosophy. She died of cancer at the age of fifty in 2021. She was a wonderful person; a clear-minded philosopher of enviable intellectual gifts but also immense personal kindness. It was good to be able to think of her while writing this talk and, in however small a way, introduce some of her ideas to a new audience when I gave it.&lt;/p&gt;
</description>
    </item>
    
    
    
    <item>
      <title>Mamdani vs Sliwa and Cuomo</title>
      <link>https://kieranhealy.org/blog/archives/2025/11/06/mamdani-vs-sliwa-and-cuomo/</link>
      <pubDate>Thu, 06 Nov 2025 12:57:44 -0500</pubDate>
      
      <guid>https://kieranhealy.org/blog/archives/2025/11/06/mamdani-vs-sliwa-and-cuomo/</guid>
      <description>&lt;p&gt;Mamdani&amp;rsquo;s victory in the New York City mayoral election gave me the opportunity to draw a few maps, and also to learn a bit about incorporating additional spatial data into maps drawn in R. R is not a specialized piece of GIS software. ESRI&amp;rsquo;s &lt;a href=&#34;https://www.arcgis.com/&#34;&gt;ArcGIS&lt;/a&gt; is the 800lb gorilla in this world and &lt;a href=&#34;https://qgis.org&#34;&gt;QGIS&lt;/a&gt;  the &lt;a href=&#34;https://www.gimp.org&#34;&gt;GIMP&lt;/a&gt; to its Photoshop, so to speak.&lt;/p&gt;
&lt;p&gt;Still, you can do a lot of spatial stuff in R, grounded in the &lt;a href=&#34;https://r-spatial.github.io/sf/&#34;&gt;&lt;code&gt;sf&lt;/code&gt; package&lt;/a&gt; and its many friends. Plus you get the benefit of all the data manipulation and analysis that R is really good at. So, having gotten the precinct-level results for the election, some maps from New York City (e.g., the &lt;a href=&#34;https://www.nyc.gov/content/planning/pages/resources/datasets/borough-boundaries&#34;&gt;clipped borough boundaries map&lt;/a&gt;), and &lt;a href=&#34;https://www.mta.info/developers&#34;&gt;GTFS data from the MTA&lt;/a&gt; describing the structure of the subway system, I was able to draw some things. I strongly approve of the existence of the &lt;a href=&#34;https://gtfs.org&#34;&gt;GTFS&lt;/a&gt;, by the way. It&amp;rsquo;s a spec for encoding transit data and lots of cities use it. Really handy.&lt;/p&gt;
&lt;p&gt;Anyway, here&amp;rsquo;s a map. For each precinct with more than twenty voters, I combined the Sliwa/Cuomo vote into what we might call (purely for compactness reasons) the Slimo vote, calculated the Mamdani and Slimo vote shares as a proportion, and subtracted the former from the latter. That gets us a score ranging from -1 to +1. I then cut that into bins, ten on each side of the zero line, to get deciles in each direction. That&amp;rsquo;s what we fill the precincts with. For the map I also drew the subway stations and lines. Several lines that are right next to each other are in effect drawn on top of one another in several places, e.g. the A and the C, or the 1 and the 3, etc, but that doesn&amp;rsquo;t matter for this map. (You may have heard that drawing transit maps meant for navigation is a really hard information design challenge.) We then use a discrete, diverging scale. Here&amp;rsquo;s the result.&lt;/p&gt;
&lt;figure class=&#34;full-width&#34;&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/11/06/mamdani-vs-sliwa-and-cuomo/subway-mamdani-slimo.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2025/11/06/mamdani-vs-sliwa-and-cuomo/subway-mamdani-slimo.png&#34;
         alt=&#34;Precinct-level vote shares for Mamdani vs Sliwa/Cuomo&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Oh, choropleths&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
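&lt;p&gt;For readers who want to try the scoring step themselves, here is a minimal sketch in base R. It is not the code behind the map. The toy &lt;code&gt;precincts&lt;/code&gt; data frame and its column names are made up for illustration, the shares are assumed to be proportions of the combined three-candidate vote, and the bins are assumed to be equal-width steps of 0.1.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-r&#34;&gt;# Toy precinct counts. The data frame and its column names are hypothetical.
# Precincts with twenty or fewer voters are assumed to be dropped already.
precincts = data.frame(
  precinct = c('A', 'B', 'C', 'D'),
  mamdani  = c(125, 40, 300, 55),
  sliwa    = c(30, 10, 20, 60),
  cuomo    = c(45, 140, 60, 85)
)

# Combine Sliwa and Cuomo, then express each side as a share of the total.
precincts$slimo = precincts$sliwa + precincts$cuomo
total = precincts$mamdani + precincts$slimo
mamdani_share = precincts$mamdani / total
slimo_share = precincts$slimo / total

# Subtract the Mamdani share from the Slimo share: the score runs from -1 to +1.
precincts$score = slimo_share - mamdani_share

# Twenty equal-width bins, ten on each side of zero (an assumed binning rule).
breaks = seq(-10, 10) / 10
precincts$bin = cut(precincts$score, breaks = breaks, include.lowest = TRUE)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With the subtraction in this order, negative scores are precincts that leaned Mamdani and positive scores are precincts that leaned Slimo. The &lt;code&gt;bin&lt;/code&gt; factor is the sort of thing you would hand to a discrete, diverging fill scale.&lt;/p&gt;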
&lt;p&gt;I chose a blue-green color gradient partly to experiment with it and partly because neither the election nor this particular cut of the data is quite the usual Blue vs Red, Democrat vs Republican. For one thing we have amalgamated two candidates on one side of the spectrum. For another, Cuomo is in some sense a Democrat, so the way voters were split is trickier than it normally would be. &lt;a href=&#34;https://statmodeling.stat.columbia.edu/2025/11/06/if-cuomo-had-been-able-to-run-against-mamdani-head-to-head/&#34;&gt;Andrew Gelman has some more thoughts on this today&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;You can see a bunch of nice neighborhood patterns, such as the Hasidic communities in Brooklyn who voted strongly for Cuomo. And you can also see evidence of the &lt;a href=&#34;https://kieranhealy.org/blog/archives/2015/06/12/americas-ur-choropleths/&#34;&gt;characteristic weakness of choropleths&lt;/a&gt;, which is the way they can force a number characterizing some number of persons to be represented by a shape representing some area of space.&lt;/p&gt;
&lt;p&gt;One solution to this is to make a dot-density map. You put a dot on the map for every person (or maybe every n people) you want to represent. Here&amp;rsquo;s what that looks like, with a 1-to-1 representation of dots to voters.&lt;/p&gt;
&lt;figure class=&#34;full-width&#34;&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/11/06/mamdani-vs-sliwa-and-cuomo/subway-mamdani-slimo-dotmap300.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2025/11/06/mamdani-vs-sliwa-and-cuomo/subway-mamdani-slimo-dotmap300.png&#34;
         alt=&#34;Precinct-level vote shares for Mamdani vs Sliwa/Cuomo, dot-density version&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Dot dot dot.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;This is better than a choropleth in some key ways. For one thing, you can see that&amp;mdash;even in New York City&amp;mdash;some places are both much more densely populated than others and also more likely to turn out to vote. To be clear, when I say &amp;ldquo;each dot represents one vote&amp;rdquo;, it&amp;rsquo;s not as if I know the identity of every voter, or how they voted, or can precisely locate them to their home address. I promise I don&amp;rsquo;t know that. Only the NSA and the Phone Company know that stuff. What I do know is how many votes were cast for each candidate within each of the 4,200 or so precincts. And I have a polygon that represents the shape of each precinct. So I spatially sample without replacement within each polygon to randomly place a dot for each voter within their precinct. With two million or so votes the pointillist effect ends up being quite effective.&lt;/p&gt;
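&lt;p&gt;The idea of scattering one dot per vote inside a precinct polygon can be sketched in base R with rejection sampling. This is an illustration under stated assumptions, not the method used for the map above: the &lt;code&gt;precinct&lt;/code&gt; outline is invented, &lt;code&gt;inside()&lt;/code&gt; and &lt;code&gt;place_dots()&lt;/code&gt; are hypothetical helper names, and the points are drawn uniformly at random. In practice a packaged routine such as &lt;code&gt;st_sample()&lt;/code&gt; from &lt;code&gt;sf&lt;/code&gt; does this kind of job on real geometries.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-r&#34;&gt;set.seed(1)

# Hypothetical precinct outline as a matrix of (x, y) vertices.
precinct = cbind(x = c(0, 4, 4, 2, 0), y = c(0, 0, 3, 5, 3))

# Even-odd rule: count the polygon edges crossed by a ray running to the
# right of the point (px, py). An odd count means the point is inside.
inside = function(px, py, poly) {
  xi = poly[, 1]
  yi = poly[, 2]
  xj = c(xi[-1], xi[1])
  yj = c(yi[-1], yi[1])
  # Edges whose endpoints sit on opposite sides of the horizontal line y = py.
  straddle = which(sign(yi - py) != sign(yj - py))
  # Where each of those edges meets that horizontal line.
  xint = xi[straddle] + (py - yi[straddle]) *
    (xj[straddle] - xi[straddle]) / (yj[straddle] - yi[straddle])
  sum(sign(xint - px) == 1) %% 2 == 1
}

# Rejection sampling: draw from the bounding box, keep points that land inside.
place_dots = function(n, poly) {
  dots = matrix(NA_real_, nrow = n, ncol = 2,
                dimnames = list(NULL, c('x', 'y')))
  kept = 0
  while (kept != n) {
    px = runif(1, min(poly[, 1]), max(poly[, 1]))
    py = runif(1, min(poly[, 2]), max(poly[, 2]))
    if (inside(px, py, poly)) {
      kept = kept + 1
      dots[kept, ] = c(px, py)
    }
  }
  dots
}

# One dot per vote: here, 200 votes cast in this one precinct.
dots = place_dots(200, precinct)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To build a full map you would repeat this for each candidate&amp;rsquo;s vote count in each precinct and color the resulting dots by candidate.&lt;/p&gt;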
&lt;figure class=&#34;full-width&#34;&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/11/06/mamdani-vs-sliwa-and-cuomo/subway-mamdani-slimo-detail2.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2025/11/06/mamdani-vs-sliwa-and-cuomo/subway-mamdani-slimo-detail2.png&#34;
         alt=&#34;Sample&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;A slice across Manhattan in the upper 50s and lower 60s and over into Queens.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;The main reason for doing this is that, from the point of view of the election, precincts aren&amp;rsquo;t real. They are closely related to the social geography of the city, which is one of the reasons we want to draw a map like this, but in and of themselves they are at best proxies for the thing we care about, so we should take care not to reify them. The sampling method still brings out the basis of the data, so you can see the precinct-patchwork that forms the underlying grid, especially in densely-populated areas. But those polygons are filled in proportion to the number of people who actually voted. I saw someone on social media observe that this &amp;ldquo;conflated&amp;rdquo; partisan lean and population density. But again, precincts aren&amp;rsquo;t real. The distribution of partisan lean across population density is what we&amp;rsquo;re trying to bring out with this quasi-person-level approach.&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/11/06/mamdani-vs-sliwa-and-cuomo/subway-mamdani-slimo-detail3.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2025/11/06/mamdani-vs-sliwa-and-cuomo/subway-mamdani-slimo-detail3.png&#34;
         alt=&#34;Precinct-level vote shares for Mamdani vs Sliwa/Cuomo, dot-density version, high-res detail&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Detail of the PDF.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;There are costs, of course. Color choice is harder, especially with more than one category of dot at once. The image tends to look less bright because you&amp;rsquo;re not using a big swatch of paint across a large area. The visual impression is also very sensitive to small changes in the size of the dots and to settings like their alpha transparency. The order in which the layers are drawn also matters a great deal at this sort of resolution. In addition, when you draw a few million dots on a large grid of pixels then your file size gets big real fast. The main dot-density image above is a 300dpi 4,500 by 4,500 pixel PNG and it&amp;rsquo;s 12MB in size before being crushed down further (with &lt;code&gt;optipng&lt;/code&gt;) to 9MB or so. I could make a JPG of course, which would be a lot smaller, but then you start running into the question of why you made a dot-density map in the first place, because you lose detail in the raster. In fairness, you&amp;rsquo;d still get the benefit of having larger and less densely-populated (or lower-turnout) precincts not appear fully filled-in, even if you wouldn&amp;rsquo;t be able to zoom in at all.&lt;/p&gt;
&lt;p&gt;Meanwhile, thanks to the wonders of multiplication, a 600dpi version of the same image is much sharper to zoom in on but is also about 35MB in size. A PDF, which is a vector rather than a raster format, is even bigger, weighing in at 63MB. Rendering a PDF with that many vector elements does not make Preview or Illustrator happy, let me tell you. The benefit of course is that you can scale it up to any size you like without loss, as shown above. Because browsers are crazy and so is javascript, it&amp;rsquo;s possible to serve up dot density maps like this in real time in a way that makes them zoomable and fluid, but that&amp;rsquo;s not my department.&lt;/p&gt;
</description>
    </item>
    
    
    
    <item>
      <title>GSS Release</title>
      <link>https://kieranhealy.org/blog/archives/2025/10/28/gss-release/</link>
      <pubDate>Tue, 28 Oct 2025 11:14:15 -0400</pubDate>
      
      <guid>https://kieranhealy.org/blog/archives/2025/10/28/gss-release/</guid>
      <description>&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/10/28/gss-release/gss-immigration.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2025/10/28/gss-release/gss-immigration.png&#34;
         alt=&#34;GSS immigration question&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Trends in the &lt;code&gt;immameco&lt;/code&gt; question.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Release 2 of the 2024 GSS cross-section and 1972-2024 cumulative data are &lt;a href=&#34;https://gss.norc.org/get-the-data.html&#34;&gt;now available&lt;/a&gt;. I&amp;rsquo;ve updated &lt;a href=&#34;https://kjhealy.github.io/gssr/&#34;&gt;&lt;code&gt;gssr&lt;/code&gt;&lt;/a&gt; and &lt;a href=&#34;https://kjhealy.github.io/gssrdoc/&#34;&gt;&lt;code&gt;gssrdoc&lt;/code&gt;&lt;/a&gt; to incorporate them. There are quite a few changes in the data and variables, thanks in part to some changes in data collection methods and a privacy/disclosure review.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;gssr&lt;/code&gt; and &lt;code&gt;gssrdoc&lt;/code&gt; packages are the nicest way to get General Social Survey data up and running in R. The figure above shows (survey-weighted) trends derived from the &lt;a href=&#34;https://kjhealy.github.io/gssrdoc/reference/immameco.html&#34;&gt;&lt;code&gt;immameco&lt;/code&gt;&lt;/a&gt; question.&lt;/p&gt;
</description>
    </item>
    
    
    
    <item>
      <title>Manhattan Plot of Manhattan</title>
      <link>https://kieranhealy.org/blog/archives/2025/10/25/manhattan-plot-of-manhattan/</link>
      <pubDate>Sat, 25 Oct 2025 11:38:02 -0400</pubDate>
      
      <guid>https://kieranhealy.org/blog/archives/2025/10/25/manhattan-plot-of-manhattan/</guid>
      <description>&lt;figure class=&#34;full-width&#34;&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/10/25/manhattan-plot-of-manhattan/skyline-plot.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2025/10/25/manhattan-plot-of-manhattan/skyline-plot.png&#34;
         alt=&#34;Skyline plot&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Here I continue my efforts to design visualizations that are as poorly-suited as possible to being displayed on phones. It looks pretty good on a big monitor, or six feet wide on a wall.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;I made a version of this plot a few years ago. I ended up revisiting it this morning  because I&amp;rsquo;m updating various datasets and code. A &lt;a href=&#34;https://en.wikipedia.org/wiki/Manhattan_plot&#34;&gt;Manhattan
plot&lt;/a&gt; is a term sometimes used to describe a kind of scatter plot where the x-values are fairly continuous, and
the y-values have distributions with long tails, so the plot looks like a skyline. This one here is a bar chart rather than a scatter plot but it&amp;rsquo;s still a kind of Manhattan plot of Manhattan.&lt;/p&gt;
&lt;p&gt;The plot shows the heights of almost all currently-existing buildings&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; in Manhattan (on the y-axis) by their year of construction on the x-axis. What I want is a plot that gives a sense of the distribution of building heights over time. To make the plot work I play a few tricks. First, the resolution of the x-axis is only to the year, which would result in way too much overplotting. (We have almost 35,000 buildings to draw.) So we add a small amount of random noise to the x-values, which makes buildings distribute themselves around their year of construction. There&amp;rsquo;s still overplotting, but now it works in our favor. It contributes to a feeling of building density.&lt;/p&gt;
&lt;p&gt;Second, there are so many buildings that we can&amp;rsquo;t plot everything as a solid, filled rectangle. Instead, we make the outlines of each rectangle a very thin white line, so everything looks like a vector-driven video game from 1981. Then we make a variable that bins buildings by height into ten categories, one for every hundred feet of additional roof height. We map that to the fill color of the rectangles (darker purples for shorter buildings through bright yellow for taller ones), which means the taller a building the brighter it looks. But again, we can&amp;rsquo;t just plot those as solid colors. So we also take the roof heights and rescale them to a range of 0.4 to 0.85. Then we map &lt;em&gt;that&lt;/em&gt; number directly to the alpha channel of the fill color. So taller buildings are not only brighter in color, they are more opaque. Alpha runs from 0 (fully transparent) to 1 (fully opaque). The specific values of 0.4 to 0.85 are just from trial and error. We want the taller buildings to stand out, and because there are fewer of them they need to be more opaque. Whereas the fills of many more shorter buildings will overlap, so they can be individually more transparent.&lt;/p&gt;
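&lt;p&gt;The three data-preparation tricks (jittering the year, binning the heights, and rescaling height to an alpha value) can be sketched in base R as follows. Treat it as a rough illustration. The toy &lt;code&gt;buildings&lt;/code&gt; data frame is invented, the jitter width of half a year either way is a guess, and putting everything above 900 feet into a single top bin is just one assumed way of arriving at ten categories.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-r&#34;&gt;set.seed(42)

# Hypothetical stand-in for the buildings data. Values are made up.
buildings = data.frame(
  year       = c(1900, 1931, 1931, 1972, 2015, 2020),
  heightroof = c(45, 1250, 80, 640, 1396, 1428)
)

# Jitter: spread buildings around their year of construction.
# The half-year window is an assumption.
buildings$year_jit = buildings$year + runif(nrow(buildings), -0.5, 0.5)

# Bin roof heights by the hundred feet. Nine 100-foot bins plus one open
# top bin gives ten categories (an assumed treatment of the tallest towers).
height_breaks = c(seq(0, 900, by = 100), Inf)
buildings$height_bin = cut(buildings$heightroof,
                           breaks = height_breaks,
                           include.lowest = TRUE)

# Linearly rescale roof height onto the alpha range 0.4 to 0.85.
lo = 0.4
hi = 0.85
rng = range(buildings$heightroof)
buildings$alpha = lo + (buildings$heightroof - rng[1]) / diff(rng) * (hi - lo)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In a plot, &lt;code&gt;year_jit&lt;/code&gt; would go on the x-axis, &lt;code&gt;height_bin&lt;/code&gt; would drive the fill color, and &lt;code&gt;alpha&lt;/code&gt; would be used as-is for the transparency of each fill.&lt;/p&gt;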
&lt;p&gt;You can really see the recent rise of supertall fancy apartment buildings in Manhattan in the last ten years or so&amp;mdash;buildings like the curséd &lt;a href=&#34;https://en.wikipedia.org/wiki/111_West_57th_Street&#34;&gt;111 West 57th&lt;/a&gt; and
&lt;a href=&#34;https://en.wikipedia.org/wiki/53W53&#34;&gt;53 West 53rd&lt;/a&gt;. The cursédness of 111 W57th extends to the dataset:&lt;/p&gt;
&lt;div class=&#34;highlight-wrapper&#34;&gt;
    
    
        &lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt; 1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 5
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 6
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 7
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 8
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 9
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;10
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;11
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;manhat&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;filter&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;heightroof&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;==&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;1428&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; Simple feature collection with 1 feature and 19 fields&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; Geometry type: POLYGON&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; Dimension:     XY&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; Bounding box:  xmin: 990374 ymin: 217856.8 xmax: 990559.1 ymax: 218081.1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; Projected CRS: NAD83 / New York Long Island (ftUS)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;   name     bin cnstrct_yr date_lstmo   time_lstmo lststatype doitt_id heightroof feat_code groundelev shape_area shape_len   base_bbl&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; 1 &amp;lt;NA&amp;gt; 1023728       1924 2021-01-04 00:00:00.000     Merged  1260269       1428      2100         58          0         0 1010100025&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;   mpluto_bbl geomsource BoroCode  BoroName Shape_Leng Shape_Area                       geometry&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; 1 1010107507 Other (Man        1 Manhattan   359993.1  636620786 POLYGON ((990461.3 217856.8...&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
    
&lt;/div&gt;

&lt;p&gt;It&amp;rsquo;s recorded as having been completed in 1924, instead of 2019. Why? There&amp;rsquo;s a hint in the &lt;code&gt;lststatype&lt;/code&gt; column, which says &amp;ldquo;Merged&amp;rdquo;. Technically the building took over and &amp;ldquo;renovated&amp;rdquo; Steinway Hall, which &lt;em&gt;was&lt;/em&gt; built in 1924. There are a few cases like this in the dataset for buildings that are going to be salient in the figure&amp;mdash;i.e. very tall ones. There&amp;rsquo;s also some missing data, with eight buildings over 600 feet tall where the construction year is not available. At least two of those are because the building was still under construction when the data were recorded. As always, 90% of data analysis is data cleaning.&lt;/p&gt;
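&lt;p&gt;A check for the missing-year cases might look something like the sketch below. The records are invented for illustration and held in a plain data frame rather than the spatial object used above. Only the column names &lt;code&gt;heightroof&lt;/code&gt; and &lt;code&gt;cnstrct_yr&lt;/code&gt; come from the dataset.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Hypothetical toy records; only the column names come from the dataset
toy &amp;lt;- data.frame(
  heightroof = c(1428, 650, 720, 300, 980),
  cnstrct_yr = c(1924, NA, 2015, NA, NA)
)

# Tall buildings (over 600 feet) whose construction year is missing
tall_missing &amp;lt;- subset(toy, heightroof &amp;gt; 600 &amp;amp; is.na(cnstrct_yr))
tall_missing
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;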
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;Data on building heights and construction years come from the &lt;a href=&#34;https://data.cityofnewyork.us/Housing-Development/Building-Footprints/nqwf-w8eh&#34;&gt;NYC Open Data portal&lt;/a&gt;. The data are restricted to buildings constructed after 1899 that are currently standing in Manhattan.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    
    
    <item>
      <title>gssrdoc Updates</title>
      <link>https://kieranhealy.org/blog/archives/2025/10/19/gssrdoc-updates/</link>
      <pubDate>Sun, 19 Oct 2025 10:50:04 -0400</pubDate>
      
      <guid>https://kieranhealy.org/blog/archives/2025/10/19/gssrdoc-updates/</guid>
      <description>&lt;p&gt;Regular readers know that I maintain &lt;a href=&#34;https://kjhealy.github.io/gssr/&#34;&gt;&lt;code&gt;gssr&lt;/code&gt;&lt;/a&gt; and &lt;a href=&#34;https://kjhealy.github.io/gssrdoc/&#34;&gt;&lt;code&gt;gssrdoc&lt;/code&gt;&lt;/a&gt;, two packages for &lt;a href=&#34;https://www.r-project.org&#34;&gt;R&lt;/a&gt;. The former makes the &lt;a href=&#34;http://gss.norc.org/&#34;&gt;General Social Survey&lt;/a&gt;&amp;rsquo;s annual, cumulative and panel datasets available in a way that&amp;rsquo;s easy to use in R. The latter makes the survey&amp;rsquo;s codebook available in R&amp;rsquo;s integrated help system in a way that documents every GSS variable as if it were a function or object in R, so you can query them in exactly the same way as any function from the R console or in the IDE of your choice. As a bonus, because I use &lt;a href=&#34;https://pkgdown.r-lib.org&#34;&gt;&lt;code&gt;pkgdown&lt;/code&gt;&lt;/a&gt; to document the packages, I get a website as a side-effect. In the case of &lt;code&gt;gssrdoc&lt;/code&gt; this means &lt;a href=&#34;https://kjhealy.github.io/gssrdoc/reference/index.html&#34;&gt;a browsable index of all the GSS variables&lt;/a&gt;. The GSS is the Hubble Space Telescope of American social science; our longest-running representative view of many aspects of the character and opinions of American households. The data is &lt;a href=&#34;https://gss.norc.org&#34;&gt;freely available from NORC&lt;/a&gt;, but they distribute it in SPSS, SAS, and STATA formats. I wrote these packages in an effort to make it more easily available in &lt;a href=&#34;https://www.r-project.org&#34;&gt;R&lt;/a&gt;. If you want to know the relationship between these various platforms, &lt;a href=&#34;https://kieranhealy.org/blog/archives/2019/02/07/statswars/&#34;&gt;I have you covered&lt;/a&gt;. But the important thing is that R is a free and open-source project, and the others are not.&lt;/p&gt;
&lt;p&gt;This week I spent a little time updating &lt;code&gt;gssrdoc&lt;/code&gt; to clean up how the help pages look and make some other improvements. Inside R, you can say, e.g., &lt;code&gt;?govdook&lt;/code&gt; at the console and have this pop up in the help:&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/10/19/gssrdoc-updates/gssrdoc-rstudio.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2025/10/19/gssrdoc-updates/gssrdoc-rstudio.png&#34;
         alt=&#34;RStudio with help page for govdook&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Yeah govdook is short for &amp;lsquo;Gov Do OK&amp;rsquo;, not &amp;lsquo;Go v Dook&amp;rsquo;.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;The package also includes &lt;code&gt;gss_doc&lt;/code&gt;, a data frame containing all of the information that the help pages are built from. I included it because it can be useful to work with directly, as when you might want to extract summary information about a subset of variables.&lt;/p&gt;
&lt;div class=&#34;highlight-wrapper&#34;&gt;
    
    
        &lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt; 1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 5
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 6
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 7
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 8
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 9
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;10
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;11
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;12
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;13
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;14
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;15
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;16
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;17
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;18
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;nf&#34;&gt;library&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;tibble&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;nf&#34;&gt;library&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;gssrdoc&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;gss_doc&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; # A tibble: 6,694 × 10&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;    variable description                           question         value_labels var_yrtab yrballot_df module_df subject_df norc_id norc_url&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;    &amp;lt;chr&amp;gt;    &amp;lt;chr&amp;gt;                                 &amp;lt;chr&amp;gt;            &amp;lt;chr&amp;gt;        &amp;lt;list&amp;gt;    &amp;lt;list&amp;gt;      &amp;lt;list&amp;gt;    &amp;lt;list&amp;gt;       &amp;lt;int&amp;gt; &amp;lt;chr&amp;gt;   &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  1 year     GSS year for this respondent          &amp;#34;GSS year&amp;#34;       &amp;#34;[NA(d)] do… &amp;lt;chr [1]&amp;gt; &amp;lt;tibble&amp;gt;    &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;         1 https:/…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  2 id       Respondent id number                  &amp;#34;Respondent id … &amp;#34;&amp;#34;           &amp;lt;chr [1]&amp;gt; &amp;lt;tibble&amp;gt;    &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;         2 https:/…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  3 wrkstat  labor force status                    &amp;#34;Last week were… &amp;#34;[1] workin… &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;    &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;         3 https:/…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  4 hrs1     number of hours worked last week      &amp;#34;Last week were… &amp;#34;[89] 89+ h… &amp;lt;chr [1]&amp;gt; &amp;lt;tibble&amp;gt;    &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;         4 https:/…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  5 hrs2     number of hours usually work a week   &amp;#34;Last week were… &amp;#34;[89] 89+ h… &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;    &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;         5 https:/…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  6 evwork   ever work as long as one year         &amp;#34;Last week were… &amp;#34;[1] yes / … &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;    &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;         6 https:/…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  7 occ      R&amp;#39;s census occupation code (1970)     &amp;#34;A. What kind o… &amp;#34;[NA(d)] do… &amp;lt;chr [1]&amp;gt; &amp;lt;tibble&amp;gt;    &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;         7 https:/…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  8 prestige r&amp;#39;s occupational prestige score(1970) &amp;#34;A. What kind o… &amp;#34;[NA(d)] do… &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;    &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;         8 https:/…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  9 wrkslf   r self-emp or works for somebody      &amp;#34;A. What kind o… &amp;#34;[1] self-e… &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;    &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;         9 https:/…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; 10 wrkgovt  govt or private employee              &amp;#34;A. What kind o… &amp;#34;[1] govern… &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;    &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;        10 https:/…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; # ℹ 6,684 more rows&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
    
&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;gss_doc&lt;/code&gt; object has regular columns but also a series of &lt;a href=&#34;https://tidyr.tidyverse.org/articles/nest.html&#34;&gt;list-columns&lt;/a&gt; to (insert meme here, you know the one) put data frames inside your data frames. (They&amp;rsquo;re labeled as &amp;ldquo;&lt;a href=&#34;https://tibble.tidyverse.org&#34;&gt;tibbles&lt;/a&gt;&amp;rdquo; here, which are basically the same thing.)&lt;/p&gt;
&lt;p&gt;Why a list-column? Why a list? Well, a list is one of the fundamental ways to store data of any sort. Lists are useful because they can contain heterogeneous elements:&lt;/p&gt;
&lt;div class=&#34;highlight-wrapper&#34;&gt;
    
    
        &lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt; 1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 5
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 6
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 7
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 8
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 9
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;10
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;11
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;12
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;13
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;14
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;15
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;16
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;17
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;18
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;19
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;items&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;list&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;n&#34;&gt;todo_home&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;c&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;Laundry&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Clean bathroom&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Feed cat&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Bring out rubbish bins&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;n&#34;&gt;important_dates&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;as.Date&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;nf&#34;&gt;c&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;1776-07-04&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;1788-06-21&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;2025-01-18&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;n&#34;&gt;keycode&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;8675309&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;n&#34;&gt;storage_tiers&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;c&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;m&#34;&gt;128&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;256&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;512&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;1024&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;items&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; $todo_home&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; [1] &amp;#34;Laundry&amp;#34;                &amp;#34;Clean bathroom&amp;#34;         &amp;#34;Feed cat&amp;#34;               &amp;#34;Bring out rubbish bins&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; $important_dates&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; [1] &amp;#34;1776-07-04&amp;#34; &amp;#34;1788-06-21&amp;#34; &amp;#34;2025-01-18&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; $keycode&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; [1] 8675309&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; $storage_tiers&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; [1]  128  256  512 1024&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
    
&lt;/div&gt;

&lt;p&gt;One thing to notice about a list like this is that it doesn&amp;rsquo;t really make sense to represent it as a table. This is partly because the elements of the list are of different lengths, but really it&amp;rsquo;s because if we &lt;em&gt;did&lt;/em&gt; represent it as a table, it would not mean anything to read across the rows:&lt;/p&gt;
&lt;div class=&#34;highlight-wrapper&#34;&gt;
    
    
        &lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;5
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;6
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;7
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;8
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;items_df&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; # A tibble: 4 × 4&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;   todo_home              important_dates keycode storage_tiers&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;   &amp;lt;chr&amp;gt;                  &amp;lt;date&amp;gt;            &amp;lt;int&amp;gt;         &amp;lt;int&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; 1 Laundry                1776-07-04      8675309           128&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; 2 Clean bathroom         1788-06-21           -            256&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; 3 Feed cat               2025-01-18           -            512&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; 4 Bring out rubbish bins -                    -           1024&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
    
&lt;/div&gt;

&lt;p&gt;The rows don&amp;rsquo;t form &amp;ldquo;cases&amp;rdquo; of anything. We just have four unrelated categories with various pieces of information in them.&lt;/p&gt;
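&lt;p&gt;To see why the table has to be forced into shape, here is one way a list like this could be squeezed into a data frame. This is a sketch in base R, not necessarily how &lt;code&gt;items_df&lt;/code&gt; was made. Each element is padded out to the length of the longest one, with &lt;code&gt;NA&lt;/code&gt; standing in for the dashes shown above.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;items &amp;lt;- list(
  todo_home = c(&amp;#34;Laundry&amp;#34;, &amp;#34;Clean bathroom&amp;#34;, &amp;#34;Feed cat&amp;#34;, &amp;#34;Bring out rubbish bins&amp;#34;),
  important_dates = as.Date(c(&amp;#34;1776-07-04&amp;#34;, &amp;#34;1788-06-21&amp;#34;, &amp;#34;2025-01-18&amp;#34;)),
  keycode = 8675309,
  storage_tiers = c(128, 256, 512, 1024)
)

# The elements have different lengths, so they do not line up as rows
lengths(items)

# Pad every element to the longest length; indexing past the end yields NA
n &amp;lt;- max(lengths(items))
padded &amp;lt;- lapply(items, function(x) x[seq_len(n)])

# The result is a valid data frame, but its rows are not cases of anything
items_df &amp;lt;- as.data.frame(padded)
items_df
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The padding makes the columns the same length, but it is purely mechanical. Nothing connects &amp;ldquo;Laundry&amp;rdquo; to 1776 or to 128.&lt;/p&gt;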
&lt;p&gt;Lists are also useful because they lend themselves easily to being nested:&lt;/p&gt;
&lt;div class=&#34;highlight-wrapper&#34;&gt;
    
    
        &lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt; 1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 5
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 6
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 7
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 8
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 9
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;10
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;11
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;12
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;13
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;14
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;15
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;16
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;17
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;18
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;19
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;20
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;21
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;22
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;23
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;24
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;25
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;26
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;27
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;28
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;29
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;30
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;31
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;32
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;33
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;34
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;35
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;36
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;37
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;38
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;items&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;list&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;n&#34;&gt;todo_home&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;list&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;n&#34;&gt;tasks&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;c&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;Laundry&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Clean bathroom&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Feed cat&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Bring out rubbish bins&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;n&#34;&gt;tobuy&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;c&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;Cat Food&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Burritos&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;n&#34;&gt;wifi_password&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;p@ssw0rd!&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;p&#34;&gt;),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;n&#34;&gt;important_dates&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;as.Date&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;nf&#34;&gt;c&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;1776-07-04&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;1788-06-21&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;2025-01-18&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;n&#34;&gt;keycode&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;8675309&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;n&#34;&gt;storage_tiers&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;list&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;n&#34;&gt;ssd&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;c&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;m&#34;&gt;128&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;256&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;512&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;1024&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;n&#34;&gt;ram&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;c&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;m&#34;&gt;1&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;4&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;8&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;items&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; $todo_home&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; $todo_home$tasks&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; [1] &amp;#34;Laundry&amp;#34;                &amp;#34;Clean bathroom&amp;#34;         &amp;#34;Feed cat&amp;#34;               &amp;#34;Bring out rubbish bins&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; $todo_home$tobuy&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; [1] &amp;#34;Cat Food&amp;#34; &amp;#34;Burritos&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; $todo_home$wifi_password&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; [1] &amp;#34;p@ssw0rd!&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; $important_dates&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; [1] &amp;#34;1776-07-04&amp;#34; &amp;#34;1788-06-21&amp;#34; &amp;#34;2025-01-18&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; $keycode&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; [1] 8675309&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; $storage_tiers&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; $storage_tiers$ssd&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; [1]  128  256  512 1024&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; $storage_tiers$ram&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; [1] 1 4 8&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
    
&lt;/div&gt;

&lt;p&gt;In its bones, R is a LISP/Scheme-like list-processing language fused with features of classic &lt;a href=&#34;https://en.wikipedia.org/wiki/Array_programming&#34;&gt;array languages&lt;/a&gt; like APL. This is because, in the world of data analysis, what we deal with all the time are rectangular tables, or arrays, where rows are cases and columns are different sorts of variables. The wrinkle is that, unlike a beautiful array of pure numbers, each column might measure something (a date, a True/False answer, a location, a score, a nationality) that we&amp;rsquo;d prefer not to represent directly as a number. Sure, underneath in the computer everything is all just ones and zeros. (Or rather, electromagnetic patterns in some physical substrate that we can interpret as meaning ones and zeros.) And if we want to do any sort of data analysis that involves treating our table as a matrix then we&amp;rsquo;ll want numeric representations of all the columns. But for many uses we&amp;rsquo;d like to see &amp;ldquo;France&amp;rdquo; or &amp;ldquo;Strongly Agree&amp;rdquo; instead of &amp;ldquo;33&amp;rdquo; or &amp;ldquo;5&amp;rdquo;. Just a table of rows and columns, where different things can be represented across columns, but any particular column is all the same kind of thing.&lt;/p&gt;
&lt;p&gt;A rectangular table like that is called a data frame. One way to think of a data frame is just as a special case of a list. A data frame is a list where you can put all the list elements side by side and treat them as columns, and where all these elements are made of vectors of the same length. Beyond that, it&amp;rsquo;s a list where the nth element of each vector refers to some property of the same underlying entity, i.e. the thing that&amp;rsquo;s in the row, or case; the thing the columns are showing you measurements or properties of. You can have empty entries if needed, as when some bit of data is missing. The important thing is that each column has as many slots as there are cases, and you fill in the values for each case in the same slot in each column. Whenever you look at any table of data, one of your first questions should always be &amp;ldquo;What is a row in this table?&amp;rdquo; In this case, each row is a variable in the full GSS dataset, and each column describes some property of that variable.&lt;/p&gt;
&lt;div class=&#34;highlight-wrapper&#34;&gt;
    
    
        &lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt; 1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 5
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 6
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 7
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 8
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 9
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;10
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;11
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;12
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;13
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;14
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;15
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;16
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;17
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;18
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;nf&#34;&gt;library&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;tibble&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;nf&#34;&gt;library&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;gssrdoc&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;gss_doc&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; # A tibble: 6,694 × 10&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;    variable description                           question         value_labels var_yrtab yrballot_df module_df subject_df norc_id norc_url&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;    &amp;lt;chr&amp;gt;    &amp;lt;chr&amp;gt;                                 &amp;lt;chr&amp;gt;            &amp;lt;chr&amp;gt;        &amp;lt;list&amp;gt;    &amp;lt;list&amp;gt;      &amp;lt;list&amp;gt;    &amp;lt;list&amp;gt;       &amp;lt;int&amp;gt; &amp;lt;chr&amp;gt;   &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  1 year     GSS year for this respondent          &amp;#34;GSS year&amp;#34;       &amp;#34;[NA(d)] do… &amp;lt;chr [1]&amp;gt; &amp;lt;tibble&amp;gt;    &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;         1 https:/…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  2 id       Respondent id number                  &amp;#34;Respondent id … &amp;#34;&amp;#34;           &amp;lt;chr [1]&amp;gt; &amp;lt;tibble&amp;gt;    &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;         2 https:/…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  3 wrkstat  labor force status                    &amp;#34;Last week were… &amp;#34;[1] workin… &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;    &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;         3 https:/…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  4 hrs1     number of hours worked last week      &amp;#34;Last week were… &amp;#34;[89] 89+ h… &amp;lt;chr [1]&amp;gt; &amp;lt;tibble&amp;gt;    &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;         4 https:/…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  5 hrs2     number of hours usually work a week   &amp;#34;Last week were… &amp;#34;[89] 89+ h… &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;    &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;         5 https:/…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  6 evwork   ever work as long as one year         &amp;#34;Last week were… &amp;#34;[1] yes / … &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;    &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;         6 https:/…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  7 occ      R&amp;#39;s census occupation code (1970)     &amp;#34;A. What kind o… &amp;#34;[NA(d)] do… &amp;lt;chr [1]&amp;gt; &amp;lt;tibble&amp;gt;    &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;         7 https:/…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  8 prestige r&amp;#39;s occupational prestige score(1970) &amp;#34;A. What kind o… &amp;#34;[NA(d)] do… &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;    &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;         8 https:/…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  9 wrkslf   r self-emp or works for somebody      &amp;#34;A. What kind o… &amp;#34;[1] self-e… &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;    &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;         9 https:/…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; 10 wrkgovt  govt or private employee              &amp;#34;A. What kind o… &amp;#34;[1] govern… &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;    &amp;lt;tibble&amp;gt;  &amp;lt;tibble&amp;gt;        10 https:/…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; # ℹ 6,684 more rows&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
    
&lt;/div&gt;

&lt;p&gt;Because R was designed by statisticians (R is a descendant of &lt;a href=&#34;https://en.wikipedia.org/wiki/S_(programming_language)&#34;&gt;S&lt;/a&gt;, which like everything else in computing traces its origins to &lt;a href=&#34;https://en.wikipedia.org/wiki/S_(programming_language)&#34;&gt;Bell Labs&lt;/a&gt;), it has this concept of a data frame built in to its core instead of being bolted on afterwards, which is extremely handy. Normally data frames are just ordinary rectangles, but there&amp;rsquo;s no reason why any particular column can&amp;rsquo;t itself be thought of as a list of something else. That&amp;rsquo;s what we have here. The &lt;code&gt;var_yrtab&lt;/code&gt; column contains data frames of crosstabs of the answers to each question by year. Except where it doesn&amp;rsquo;t (e.g. for &lt;code&gt;id&lt;/code&gt;), and this is fine because lists don&amp;rsquo;t have to be internally homogeneous. Similarly &lt;code&gt;yrballot_df&lt;/code&gt; has a little table of which ballots, or internal portions of the survey, a question was asked on for each year it was asked.&lt;/p&gt;
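&lt;p&gt;To make the list idea concrete, here is a minimal sketch in base R. It is not the real &lt;code&gt;gss_doc&lt;/code&gt; object: the &lt;code&gt;details&lt;/code&gt; column and the counts inside it are made up for illustration, and only the three variable names and ids are borrowed from the printout above.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# A data frame is a list of equal-length vectors, one per column.
df &amp;lt;- data.frame(
  variable = c(&amp;#34;year&amp;#34;, &amp;#34;id&amp;#34;, &amp;#34;wrkstat&amp;#34;),
  norc_id = 1:3
)

is.list(df)       # TRUE: a data frame is a list underneath
length(df)        # 2: one list element per column
df[[&amp;#34;variable&amp;#34;]]  # pull out one column as a plain vector

# A list-column (hypothetical): each row holds its own object, and
# the objects need not be the same kind of thing. Toy values only.
df$details &amp;lt;- list(
  &amp;#34;no crosstab for this one&amp;#34;,
  NA,
  data.frame(year = c(1972, 1973), n = c(10, 12))
)

# What kind of thing sits in each row of the list-column?
sapply(df$details, class)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The last line reports a different class for each entry in &lt;code&gt;details&lt;/code&gt;, which is the sense in which a list-column does not have to be internally homogeneous.&lt;/p&gt;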
&lt;p&gt;The upshot is that having assembled the &lt;code&gt;gss_doc&lt;/code&gt; object we can use it to emit, like, seven thousand pages of documentation on the GSS&amp;rsquo;s many, many questions over the years. We can build them as standardized R help pages, as above. On the &lt;a href=&#34;https://kjhealy.github.io/gssrdoc/index.html&#34;&gt;website&lt;/a&gt; that &lt;code&gt;pkgdown&lt;/code&gt; builds for us, we get this:&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/10/19/gssrdoc-updates/gssrdoc-varpage.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2025/10/19/gssrdoc-updates/gssrdoc-varpage.png&#34;
         alt=&#34;Website view.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Website view.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;The cross-referencing to other relevant variables in the &amp;ldquo;See Also&amp;rdquo; section is new in this version. It comes courtesy of the GSS&amp;rsquo;s own information about survey modules and an ad hoc topic index they keep for the variables. I just use a subset of possible cross-references as we don&amp;rsquo;t want, e.g., every single question in the GSS core to be cross-referenced to every other core question on any particular help page. On the website, I gather these into a &lt;a href=&#34;https://kjhealy.github.io/gssrdoc/articles/topics.html&#34;&gt;single page&lt;/a&gt;:&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/10/19/gssrdoc-updates/gssrdoc-topics.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2025/10/19/gssrdoc-updates/gssrdoc-topics.png&#34;
         alt=&#34;Topic index page.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Topic index page.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;The GSS has its own handy &lt;a href=&#34;https://gssdataexplorer.norc.org&#34;&gt;data explorer&lt;/a&gt; which is very useful for quickly checking on particular trends and getting a quick graph of what the data look like, or a summary view of the content of particular variables. Each help page in &lt;code&gt;gssrdoc&lt;/code&gt; now links to the GSS Data Explorer page for that variable, in case you want to hop over and take a look there. Of course, the &lt;code&gt;gssrdoc&lt;/code&gt; package doesn&amp;rsquo;t and isn&amp;rsquo;t meant to replace the Data Explorer; it&amp;rsquo;s just a different view of the same information, with a different use-case in mind.&lt;/p&gt;
</description>
    </item>
    
    
    
    <item>
      <title>Parking Signs</title>
      <link>https://kieranhealy.org/blog/archives/2025/10/13/parking-signs/</link>
      <pubDate>Mon, 13 Oct 2025 16:01:57 -0400</pubDate>
      
      <guid>https://kieranhealy.org/blog/archives/2025/10/13/parking-signs/</guid>
      <description>&lt;p&gt;If you want to print out a poster for &lt;a href=&#34;https://www.nokings.org&#34;&gt;October 18th&lt;/a&gt;, here are two; fully-guaranteed countrywide but maybe especially suitable for in and around New York.&lt;/p&gt;
&lt;figure&gt;
&lt;div style=&#34;display: flex; gap: 1rem;&#34;&gt;
&lt;div style=&#34;height: 375px; width: 250px;&#34;&gt;
&lt;p&gt;&lt;img alt=&#34;&#34; height=&#34;7275&#34; id=&#34;h-rh-i-0&#34; src=&#34;https://kieranhealy.org/blog/archives/2025/10/13/parking-signs/no-king.jpg&#34; width=&#34;4875&#34;&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div style=&#34;height: 375px; width: 250px;&#34;&gt;
&lt;p&gt;&lt;img alt=&#34;&#34; height=&#34;7200&#34; id=&#34;h-rh-i-1&#34; src=&#34;https://kieranhealy.org/blog/archives/2025/10/13/parking-signs/two-term.jpg&#34; width=&#34;4800&#34;&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;figcaption&gt;&lt;h4&gt;Two signs to display&lt;/h4&gt;&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Here&amp;rsquo;s a direct link to the &lt;a href=&#34;no-king.jpg&#34;&gt;No King Anytime&lt;/a&gt; JPG and the &lt;a href=&#34;two-term.jpg&#34;&gt;Two Term Limit&lt;/a&gt; JPG.  These are available as PDFs, as well (vector outlines; no worries about fonts): here&amp;rsquo;s the &lt;a href=&#34;no-king.pdf&#34;&gt;No King Anytime&lt;/a&gt; PDF and the &lt;a href=&#34;two-term.pdf&#34;&gt;Two Term Limit&lt;/a&gt; PDF. Put them onna stick and exercise the constitutional rights to freedom of expression, speech, and assembly enjoyed by &lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/06/28/american/&#34;&gt;everyone in the United States&lt;/a&gt;.&lt;/p&gt;
</description>
    </item>
    
    
    
    <item>
      <title>Halloween in the Round</title>
      <link>https://kieranhealy.org/blog/archives/2025/10/08/halloween-in-the-round/</link>
      <pubDate>Wed, 08 Oct 2025 09:38:09 -0400</pubDate>
      
      <guid>https://kieranhealy.org/blog/archives/2025/10/08/halloween-in-the-round/</guid>
      <description>&lt;p&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2024/10/12/halloween-data-cleaning/&#34;&gt;Last year&lt;/a&gt; I wrote about cleaning some data from the &lt;a href=&#34;https://www.nhtsa.gov/research-data/fatality-analysis-reporting-system-fars&#34;&gt;NHTSA FARS database&lt;/a&gt;, the system that tracks information about road accidents in the United States. I did that again this year for my &lt;a href=&#34;https://mptc.io/&#34;&gt;Modern Plain Text Computing&lt;/a&gt; class. I won&amp;rsquo;t repeat the cleaning details, which are more or less the same. One question is how to aggregate data like this, if at all, and how to draw a picture of it. There are, as always, lots of options. The data as obtained (with a &lt;em&gt;slightly&lt;/em&gt; different query from last year; a small lesson in itself about repeated measures) arrive as counts of pedestrian fatalities (in motor vehicle accidents) for each day of the year, within months, for each year from 2009 to 2023.&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/10/08/halloween-in-the-round/fars-fatalities.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2025/10/08/halloween-in-the-round/fars-fatalities.png&#34;
         alt=&#34;The data in the spreadsheet you get after a specific query to FARS on the web.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;The data in the spreadsheet you get after a specific query to FARS on the web.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;What I was interested in was patterns of daily counts, so I wanted to average by day across years. This sort of aggregation of course gets rid of other things we might be interested in, like trends over time by year. Averaging by day means we won&amp;rsquo;t see, for example, any tendency for the number of pedestrian deaths to decrease over time. One way to draw this is a column chart with time (as day-of-the-year) on the x-axis and the average count on the y-axis.&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/10/08/halloween-in-the-round/halloween_wide.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2025/10/08/halloween-in-the-round/halloween_wide.png&#34;
         alt=&#34;Wide version.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Wide version.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;This works fine. Nice and compact. One thing people tend to underestimate about graphs like this (whether as bars or lines) is that you can often compress the vertical axis a lot without loss of information. Indeed, often with long time series that&amp;rsquo;s the right thing to do anyway, because there&amp;rsquo;s nothing better for making your trends look dramatic than narrowing the horizontal part of the aspect ratio. Here the focus is not on a trend, but on the one day that really stands out from the others. In any case, wide is good.&lt;/p&gt;
&lt;p&gt;On the other hand, most people are looking at things on their phones. Maybe a vertical view can be better there? We could do something like this, like last year:&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/10/08/halloween-in-the-round/halloween_tall25.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2025/10/08/halloween-in-the-round/halloween_tall25.png&#34;
         alt=&#34;Tall version.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Tall version.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Here we facet by month and stack them. It works OK. This would be more useful for data where you expect a bit more structure down the rows as well as across the columns of the table you&amp;rsquo;re graphing. For example, accidents vary by day of the week, especially pedestrian accidents. There are more people out at the weekend. And then for some parts of the country the seasonality of pedestrian activity would be stronger than in others, because it&amp;rsquo;s nicer to be outside some months of the year than others. (You can in fact get counts by day of the week from FARS, but I didn&amp;rsquo;t; exercise for the reader, etc.) Even in this one you can see some evidence of that, as Summer and (back to school) Fall have more fatalities than January, for instance.&lt;/p&gt;
&lt;p&gt;Finally, and the point of this post, we can also experiment with the fact that the year is cyclical, and use polar coordinates. When we do this we take our x-axis and wrap it around a circle. Position on the x-axis measured as a distance becomes position on a polar or radial axis measured as an angle. We call it theta and measure it in degrees or radians or whatever. In ggplot &lt;code&gt;coord_polar()&lt;/code&gt; is available as a transformation to do this. One of the nice things about ggplot&amp;rsquo;s grammar-of-graphics approach is that it makes it easier to show how the same graph can change under various transformations. A standard xy plot has its coordinates set up by the &lt;code&gt;coord_cartesian()&lt;/code&gt; function. We usually never write this unless we explicitly want to tweak some aspect of it, but it&amp;rsquo;s there. And we can just replace it altogether with a different coordinate transformation. This can be a polar system or something else, as when we draw a map and systematically transform points on a sphere into a flat surface via some projection.&lt;/p&gt;
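&lt;p&gt;As a rough sketch of what wrapping the x-axis around a circle means, here is the arithmetic in base R. This is not ggplot&amp;rsquo;s internal code. The helper names &lt;code&gt;day_to_theta()&lt;/code&gt; and &lt;code&gt;polar_to_xy()&lt;/code&gt; are hypothetical, and the choices to use a 366-day year and to measure the angle clockwise from the top of the circle are assumptions made for the example.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Hypothetical helper: map day-of-year (1..n_days) to an angle in radians.
day_to_theta &amp;lt;- function(day, n_days = 366) {
  2 * pi * (day - 1) / n_days
}

# Hypothetical helper: turn an angle and a radius into flat x/y positions.
# Assumption: the angle runs clockwise, starting from the top of the circle.
polar_to_xy &amp;lt;- function(theta, r) {
  data.frame(x = r * sin(theta), y = r * cos(theta))
}

days &amp;lt;- c(1, 92, 183, 305)  # Jan 1, Apr 1, Jul 1, Oct 31 in a leap year
theta &amp;lt;- day_to_theta(days)

round(theta * 180 / pi)    # the same angles expressed in degrees
polar_to_xy(theta, r = 1)  # where each day lands on a unit circle
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In a plot like this the radius &lt;code&gt;r&lt;/code&gt; would carry the count for each day, so a day that stands out from the others pokes out further from the center.&lt;/p&gt;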
&lt;p&gt;Version 4 of ggplot came out last month and introduced &lt;code&gt;coord_radial()&lt;/code&gt; as an improved version of &lt;code&gt;coord_polar()&lt;/code&gt;. We can use it to draw our plot. Here we&amp;rsquo;ll use a FARS dataset that shows fatal crashes that involve pedestrians aged 17 and under. (The difference from the ones above is that pedestrians were &amp;ldquo;involved&amp;rdquo; in the crash but weren&amp;rsquo;t necessarily killed.) We write code that looks like this:&lt;/p&gt;
&lt;div class=&#34;highlight-wrapper&#34;&gt;
    
    
        &lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt; 1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 5
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 6
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 7
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 8
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 9
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;10
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;11
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;12
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;13
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;14
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;15
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;16
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;17
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;18
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;19
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;20
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;21
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;22
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;23
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;24
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;25
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;26
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;27
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;28
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;29
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;30
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;31
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;32
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;33
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;34
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;35
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;36
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;37
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;38
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;m_breaks&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;cumsum&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;nf&#34;&gt;as.integer&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;nf&#34;&gt;diff&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;nf&#34;&gt;seq&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;nf&#34;&gt;as.Date&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;2016-01-01&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;                                       &lt;span class=&#34;nf&#34;&gt;as.Date&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;2017-01-01&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;by&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;month&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;))))&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;-&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;16&lt;/span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;arrow_segment_df&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;tibble&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;n&#34;&gt;x&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;10&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;y&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;3.4&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;xend&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;61&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;yend&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;3.2&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;p_out&lt;/span&gt;  &lt;span class=&#34;o&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;ggplot&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;data&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;fars_involved_agg&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;       &lt;span class=&#34;n&#34;&gt;mapping&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;aes&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;x&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;day_ind&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;y&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;n&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;color&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;flag&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;fill&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;flag&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;))&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;nf&#34;&gt;geom_point&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;group&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;1&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;size&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;2.5&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;shape&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;21&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;nf&#34;&gt;geom_textsegment&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;      &lt;span class=&#34;n&#34;&gt;data&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;arrow_segment_df&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;      &lt;span class=&#34;n&#34;&gt;mapping&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;aes&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;x&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;x&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;y&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;y&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;xend&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;xend&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;yend&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;yend&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;      &lt;span class=&#34;n&#34;&gt;label&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Calendar Day&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;color&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;cornflowerblue&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;      &lt;span class=&#34;n&#34;&gt;family&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Socviz Condensed&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;      &lt;span class=&#34;n&#34;&gt;arrow&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;arrow&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;length&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;unit&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;m&#34;&gt;0.2&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;cm&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;type&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;closed&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;      &lt;span class=&#34;n&#34;&gt;inherit.aes&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;kc&#34;&gt;FALSE&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;linewidth&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;0.5&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;nf&#34;&gt;annotate&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;text&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;x&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;305&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;                &lt;span class=&#34;n&#34;&gt;y&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;4&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;label&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Halloween&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;size&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;m&#34;&gt;5&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;hjust&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;-0.12&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;                &lt;span class=&#34;n&#34;&gt;color&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;darkorange2&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;family&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Socviz Condensed&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;             &lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;nf&#34;&gt;scale_color_manual&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;values&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;c&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;gray10&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;gray5&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;))&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt;    
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;nf&#34;&gt;scale_fill_manual&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;values&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;c&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;kc&#34;&gt;NA&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;darkorange2&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;))&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;nf&#34;&gt;coord_radial&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;expand&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;kc&#34;&gt;FALSE&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;rlim&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;c&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;m&#34;&gt;0&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;&lt;span class=&#34;m&#34;&gt;4.25&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;inner.radius&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;0.25&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;                 &lt;span class=&#34;n&#34;&gt;r.axis.inside&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;kc&#34;&gt;TRUE&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;nf&#34;&gt;scale_x_continuous&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;breaks&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;m_breaks&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;labels&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;month.name&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;minor_breaks&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;m_breaks&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;-&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;15&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;nf&#34;&gt;guides&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;color&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;none&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;fill&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;none&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;theta&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;axis_textpath&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;nf&#34;&gt;labs&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;x&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;kc&#34;&gt;NULL&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;         &lt;span class=&#34;n&#34;&gt;y&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;kc&#34;&gt;NULL&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;         &lt;span class=&#34;n&#34;&gt;title&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Fatal Motor Vehicle Crashes involving Child Pedestrians&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;         &lt;span class=&#34;n&#34;&gt;subtitle&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Daily Means, 2009-2023&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;         &lt;span class=&#34;n&#34;&gt;caption&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Figure: Kieran Healy / Data: NHTSA Fatality Analysis Reporting System&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;theme&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;panel.grid.major.x&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;element_blank&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;        &lt;span class=&#34;n&#34;&gt;panel.grid.minor.x&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;element_line&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;color&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;gray10&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;        &lt;span class=&#34;n&#34;&gt;panel.grid.major.y&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;element_line&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;color&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;gray50&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;        &lt;span class=&#34;n&#34;&gt;panel.grid.minor.y&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;element_blank&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;        &lt;span class=&#34;n&#34;&gt;axis.text&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;element_text&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;face&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;bold&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;        &lt;span class=&#34;n&#34;&gt;axis.ticks.theta&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;element_blank&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;())&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
    
&lt;/div&gt;

&lt;p&gt;This is well beyond the minimum necessary to get a serviceable plot, but I got carried away polishing the thing. The key bit is the call to &lt;code&gt;coord_radial()&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight-wrapper&#34;&gt;
    
    
        &lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;4
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;nf&#34;&gt;coord_radial&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;expand&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;kc&#34;&gt;FALSE&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;             &lt;span class=&#34;n&#34;&gt;rlim&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;c&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;m&#34;&gt;0&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;&lt;span class=&#34;m&#34;&gt;4.25&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;             &lt;span class=&#34;n&#34;&gt;r.axis.inside&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;kc&#34;&gt;TRUE&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;             &lt;span class=&#34;n&#34;&gt;inner.radius&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;0.25&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
    
&lt;/div&gt;

&lt;p&gt;This says: draw the x-axis as a circle; don&amp;rsquo;t expand the margins (this has the effect of keeping January at the top rather than being slightly decentered); set the y-axis limits (i.e. the length of the radius) to be slightly more than the maximum of the data; put the radial axis inside the plot rather than next to it; and make the circle more like a donut by putting a circular &amp;ldquo;hole&amp;rdquo; in the middle of the graph that&amp;rsquo;s 25 percent of the total length of the radius. Donut-style versions of polar plots are often easier to read, as opposed to having everything go into the very center of the circle.&lt;/p&gt;
&lt;p&gt;The word &amp;ldquo;Halloween&amp;rdquo; is put in with &lt;code&gt;annotate()&lt;/code&gt;. The call to &lt;code&gt;geom_textsegment()&lt;/code&gt; is from the very handy &lt;a href=&#34;https://allancameron.github.io/geomtextpath/&#34;&gt;geomtextpath&lt;/a&gt; package. This lets us put text or labels on or alongside lines in a way that follows their shape. You can do this for ordinary trend lines in xy plots but it understands polar coordinates too. In addition, ggplot 4&amp;rsquo;s radial system knows about geomtextpath. If you&amp;rsquo;re using radial coordinates and geomtextpath is loaded then you can write &lt;code&gt;guides(theta = &amp;quot;axis_textpath&amp;quot;)&lt;/code&gt;. This will change the labels of the &lt;code&gt;theta&lt;/code&gt; axis (which is to say, the polar-transformed x-axis). Instead of sticking out horizontally they will pleasingly follow the path of the circle. Nice. We also use the distinction between major and minor breaks to separate the labels from the panel grid lines: the month names go on the major breaks, whose grid lines and tick marks are turned off, while the grid lines are drawn on the &lt;em&gt;minor&lt;/em&gt; breaks, fifteen days earlier. That puts the month names in the middle of each monthly wedge, which is better than having them at the beginning. The only other slightly fancy thing is deliberately picking a shape for the points that lets me have them be empty rings, except for the Halloween point, which is filled. Here&amp;rsquo;s what we get.&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/10/08/halloween-in-the-round/halloween_polar.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2025/10/08/halloween-in-the-round/halloween_polar.png&#34;
         alt=&#34;Roundy version.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Roundy version.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Not bad. You have to be careful with polar coordinates. People are much better at judging relative lengths than relative angles. This is why pie charts are hated by dataviz nerds. A pie chart is, after all, just a bar or column chart whose x-axis has been spun into a circle. In this case, our humble wide-aspect column plot does perfectly well for much less effort, and has the advantage of being immediately comprehensible by more people. There are cases where truly seasonal data can really benefit from being represented on or in a circle. This data isn&amp;rsquo;t quite one of those, as we&amp;rsquo;re just saying &amp;ldquo;Hey, look how this one day is different from all the others&amp;rdquo;. But by the same token, we&amp;rsquo;re not really asking the viewer to judge angular offsets or to estimate the area of a pie wedge. Instead it&amp;rsquo;s just distance from the inner circle. Seeing the points for all the other days cluster close to the inner ring while Halloween stands out a ways does a decent job of getting the content across, and maybe in a way friendlier to people looking at the plot on a phone.&lt;/p&gt;
&lt;p&gt;PS: One other thing. Remember, as we also noted last year, that any inference about how dangerous Halloween is for child pedestrians shouldn&amp;rsquo;t depend just on the number of fatalities observed but also on the exposure, which we don&amp;rsquo;t see directly. There are lots more children&amp;mdash;perhaps orders of magnitude more&amp;mdash;wandering around on the evening of Halloween than on a typical night, which will change the meaning of the count of pedestrian fatalities.&lt;/p&gt;
</description>
    </item>
    
    
    
    <item>
      <title>Iterating some sample data</title>
      <link>https://kieranhealy.org/blog/archives/2025/10/03/iterating-some-sample-data/</link>
      <pubDate>Fri, 03 Oct 2025 05:32:15 -0400</pubDate>
      
      <guid>https://kieranhealy.org/blog/archives/2025/10/03/iterating-some-sample-data/</guid>
      <description>&lt;p&gt;I&amp;rsquo;m teaching my &lt;a href=&#34;https://mptc.io&#34;&gt;Modern Plain Text Computing&lt;/a&gt; course this semester and so I&amp;rsquo;m on the lookout for small examples that I can use to show some of the ordinary techniques we regularly use when working with tables of data. One of those is just coming up with some example data to illustrate something else, like how to draw a plot or fit a model or what have you. This is partly what the stock datasets that come bundled with packages are for, like the venerable &lt;a href=&#34;https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/mtcars.html&#34;&gt;mtcars&lt;/a&gt; or the more recent &lt;a href=&#34;https://allisonhorst.github.io/palmerpenguins/&#34;&gt;palmerpenguins&lt;/a&gt;. Sometimes, though, you end up quickly making up an example yourself. This can be a good way to practice stuff that computers are good at, like doing things repeatedly.&lt;/p&gt;
&lt;p&gt;This happened the other day in response to a question about visualizing some evaluation data. The task goes like this. You are testing a bunch of different LLMs. Say, fifteen of them. You have trained them to return Yes/No answers when they look at repeated samples of some test data. Let&amp;rsquo;s say each LLM is asked a hundred questions. You have also had an expert person look at the same hundred questions and give you their Yes/No answers. The person&amp;rsquo;s answers are the ground truth. You want to know how each LLM performs against them. So for each LLM you have a two-by-two table showing counts or rates of true positives, true negatives, false positives, and false negatives. (This is called a &amp;ldquo;&lt;a href=&#34;https://en.wikipedia.org/wiki/Confusion_matrix&#34;&gt;confusion matrix&lt;/a&gt;&amp;rdquo;.) You want to visualize performance across all the LLMs. An additional wrinkle is that, from the point of view of your business, responses are variably costly. Correct answers (true positives or true negatives) cost one unit. Then, say, a false negative costs two units and a false positive is worst, costing four units.&lt;/p&gt;
&lt;p&gt;The questioner wanted some thoughts on what sort of graph to draw. You can of course just picture what the data would look like and figure out which of your many stock datasets has an analogous structure. Or you&amp;rsquo;d sketch out an answer with pen and paper. In this case, even though they have problems in general, I thought a kind of stacked bar chart (but flipped on its side) might work. OK, done. But half the fun&amp;mdash;for some values of &amp;ldquo;fun&amp;rdquo;&amp;mdash;is generating data that looks like this. And as I said, I&amp;rsquo;m on the lookout for data-related examples of iteration, i.e. where I repeatedly do something and gather the results into a nice table.&lt;/p&gt;
&lt;p&gt;When we want to repeatedly do something, we first solve the base case and then we generalize it by putting in some sort of placeholder and use an engine that can iterate over an index of values, feeding each one to the placeholder. In imperative languages you might use a counter and a for loop. In a functional approach you map or apply some function.&lt;/p&gt;
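&lt;p&gt;As a minimal sketch of those two styles (the toy &lt;code&gt;answers&lt;/code&gt; list and the &lt;code&gt;count_yes()&lt;/code&gt; helper are made up for illustration and aren&amp;rsquo;t part of the example that follows), here is the same base case run once with a counter and a for loop and once by mapping a function over a list, using only base R:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Hypothetical toy input: three short vectors of Y/N answers
answers &amp;lt;- list(a = c(&amp;#34;Y&amp;#34;, &amp;#34;N&amp;#34;, &amp;#34;N&amp;#34;),
                b = c(&amp;#34;N&amp;#34;, &amp;#34;N&amp;#34;, &amp;#34;N&amp;#34;),
                c = c(&amp;#34;Y&amp;#34;, &amp;#34;Y&amp;#34;, &amp;#34;N&amp;#34;))

# Base case: solve the problem once, for a single vector
count_yes &amp;lt;- function(x) sum(x == &amp;#34;Y&amp;#34;)

# Imperative style: an index and a for loop filling a preallocated vector
out_loop &amp;lt;- integer(length(answers))
for (i in seq_along(answers)) {
  out_loop[i] &amp;lt;- count_yes(answers[[i]])
}
names(out_loop) &amp;lt;- names(answers)

# Functional style: map the function over the list; the placeholder is
# the function argument, and vapply() checks each result is one integer
out_map &amp;lt;- vapply(answers, count_yes, integer(1))

# Compare the two results
identical(out_loop, out_map)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Both versions should produce the same named vector of counts. The functional one has less bookkeeping, because the iteration engine handles the index and collects the results for us.&lt;/p&gt;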
&lt;p&gt;We&amp;rsquo;ve got a hundred questions and fifteen LLMs. We imagine that the LLMs can range in accuracy from 40 percent to 99 percent in one percent steps. We&amp;rsquo;ll pick at random from within this range to set how good any specific LLM is.&lt;/p&gt;
&lt;div class=&#34;highlight-wrapper&#34;&gt;
    
    
        &lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;5
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;nf&#34;&gt;set.seed&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;m&#34;&gt;100125&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;c1&#34;&gt;# so we get the same &amp;#39;random&amp;#39; results each time&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;n_runs&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;100&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;n_llms&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;15&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;accuracy_range&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;seq&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;m&#34;&gt;0.4&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;0.99&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;0.01&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
    
&lt;/div&gt;

&lt;p&gt;Our baseline is &lt;code&gt;n_runs&lt;/code&gt; human answers with some given distribution of Yes/No answers. Let&amp;rsquo;s say 80% No, 20% Yes. It doesn&amp;rsquo;t matter what they are; the person is the ground truth. We sample with replacement a hundred times from &amp;ldquo;N&amp;rdquo; or &amp;ldquo;Y&amp;rdquo; at that probability.&lt;/p&gt;
&lt;div class=&#34;highlight-wrapper&#34;&gt;
    
    
        &lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;5
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;6
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;7
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;human_evals&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;sample&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;nf&#34;&gt;c&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;N&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Y&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;n_runs&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;replace&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;kc&#34;&gt;TRUE&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;prob&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;c&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;m&#34;&gt;0.8&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;0.20&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;human_evals&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;   [1] &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;Y&amp;#34; &amp;#34;N&amp;#34; &amp;#34;Y&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;Y&amp;#34; &amp;#34;N&amp;#34; &amp;#34;Y&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;Y&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;Y&amp;#34; &amp;#34;N&amp;#34; &amp;#34;Y&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;Y&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  [34] &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;Y&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;Y&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;Y&amp;#34; &amp;#34;N&amp;#34; &amp;#34;Y&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;Y&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  [67] &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;Y&amp;#34; &amp;#34;N&amp;#34; &amp;#34;Y&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;Y&amp;#34; &amp;#34;Y&amp;#34; &amp;#34;N&amp;#34; &amp;#34;Y&amp;#34; &amp;#34;Y&amp;#34; &amp;#34;N&amp;#34; &amp;#34;Y&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34; &amp;#34;N&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; [100] &amp;#34;N&amp;#34;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
    
&lt;/div&gt;

&lt;p&gt;For each of our fifteen LLMs, what we want to do is generate a string of its one hundred Y/N answers in the same way, but with its own idiosyncratic distribution of Ys and Ns, and then evaluate it against the human baseline. And we&amp;rsquo;d like to gather all the answers into a single data frame so we can keep everything tidy.&lt;/p&gt;
&lt;p&gt;Our evaluation function looks like this:&lt;/p&gt;
&lt;div class=&#34;highlight-wrapper&#34;&gt;
    
    
        &lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;eval_llm&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;kr&#34;&gt;function&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;llm_evals&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;human_eval&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;case_when&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;n&#34;&gt;llm_evals&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;==&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Y&amp;#34;&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;human_eval&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;==&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Y&amp;#34;&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;~&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;True Positive&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;n&#34;&gt;llm_evals&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;==&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Y&amp;#34;&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;human_eval&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;==&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;N&amp;#34;&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;~&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;False Positive&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;n&#34;&gt;llm_evals&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;==&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;N&amp;#34;&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;human_eval&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;==&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Y&amp;#34;&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;~&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;False Negative&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;n&#34;&gt;llm_evals&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;==&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;N&amp;#34;&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;human_eval&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;==&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;N&amp;#34;&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;~&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;True Negative&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
    
&lt;/div&gt;
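&lt;p&gt;To see what the function returns, here is a minimal sketch that applies it to two short, made-up vectors covering all four outcomes. The names &lt;code&gt;toy_llm&lt;/code&gt;, &lt;code&gt;toy_human&lt;/code&gt;, and &lt;code&gt;toy_result&lt;/code&gt; are hypothetical and exist only for this illustration, and the function is repeated so the snippet runs on its own (assuming dplyr is installed).&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(dplyr)

# Same classification logic as eval_llm() above
eval_llm &amp;lt;- function(llm_evals, human_eval) {
  case_when(
    llm_evals == &amp;#34;Y&amp;#34; &amp;amp; human_eval == &amp;#34;Y&amp;#34; ~ &amp;#34;True Positive&amp;#34;,
    llm_evals == &amp;#34;Y&amp;#34; &amp;amp; human_eval == &amp;#34;N&amp;#34; ~ &amp;#34;False Positive&amp;#34;,
    llm_evals == &amp;#34;N&amp;#34; &amp;amp; human_eval == &amp;#34;Y&amp;#34; ~ &amp;#34;False Negative&amp;#34;,
    llm_evals == &amp;#34;N&amp;#34; &amp;amp; human_eval == &amp;#34;N&amp;#34; ~ &amp;#34;True Negative&amp;#34;
  )
}

# Hypothetical toy vectors, one element per possible outcome
toy_llm   &amp;lt;- c(&amp;#34;Y&amp;#34;, &amp;#34;Y&amp;#34;, &amp;#34;N&amp;#34;, &amp;#34;N&amp;#34;)
toy_human &amp;lt;- c(&amp;#34;Y&amp;#34;, &amp;#34;N&amp;#34;, &amp;#34;Y&amp;#34;, &amp;#34;N&amp;#34;)

# The comparison is vectorized, so we get one label per pair of answers
toy_result &amp;lt;- eval_llm(toy_llm, toy_human)
toy_result
#&amp;gt; [1] &amp;#34;True Positive&amp;#34;  &amp;#34;False Positive&amp;#34; &amp;#34;False Negative&amp;#34; &amp;#34;True Negative&amp;#34;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Because the comparison works element by element, the same call handles a hundred answers just as well as four.&lt;/p&gt;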

&lt;p&gt;We generate a string of responses for an imaginary LLM just like we did for the imaginary person. The single case would look like this:&lt;/p&gt;
&lt;div class=&#34;highlight-wrapper&#34;&gt;
    
    
        &lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;llm_01&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;sample&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;nf&#34;&gt;c&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;N&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Y&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;n_runs&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;replace&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;kc&#34;&gt;TRUE&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;prob&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;c&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;m&#34;&gt;0.7&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;0.3&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;))&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
    
&lt;/div&gt;

&lt;p&gt;We would then feed this vector to &lt;code&gt;eval_llm()&lt;/code&gt; along with the vector of human answers. But we want to do this fifteen times, with varying values for &lt;code&gt;prob&lt;/code&gt;, and we also want to put each LLM in its own column in a data frame. So we replace the hard-coded values with variables. Then we evaluate all of them.&lt;/p&gt;
&lt;p&gt;First we generate a vector of LLM names. We use &lt;code&gt;str_pad&lt;/code&gt; to get sortable numbers with a leading zero:&lt;/p&gt;
&lt;div class=&#34;highlight-wrapper&#34;&gt;
    
    
        &lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;llm_id&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;paste&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;LLM&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;str_pad&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;m&#34;&gt;1&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;:&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;n_llms&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;width&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;2&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;pad&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;0&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;sep&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;_&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;llm_id&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  [1] &amp;#34;LLM_01&amp;#34; &amp;#34;LLM_02&amp;#34; &amp;#34;LLM_03&amp;#34; &amp;#34;LLM_04&amp;#34; &amp;#34;LLM_05&amp;#34; &amp;#34;LLM_06&amp;#34; &amp;#34;LLM_07&amp;#34; &amp;#34;LLM_08&amp;#34; &amp;#34;LLM_09&amp;#34; &amp;#34;LLM_10&amp;#34; &amp;#34;LLM_11&amp;#34; &amp;#34;LLM_12&amp;#34; &amp;#34;LLM_13&amp;#34; &amp;#34;LLM_14&amp;#34; &amp;#34;LLM_15&amp;#34;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
    
&lt;/div&gt;

&lt;p&gt;Now we&amp;rsquo;re going to create a little table of LLM parameters. We already created a vector of probabilities for the LLMs, &lt;code&gt;accuracy_range&lt;/code&gt;. We&amp;rsquo;ll sample from that to get fifteen values. The number of runs is fixed at a hundred. R&amp;rsquo;s naturally vectorized way of working takes care of filling in the table properly: &lt;code&gt;p_no&lt;/code&gt; is computed element by element from &lt;code&gt;p_yes&lt;/code&gt;, and the single value of &lt;code&gt;n_runs&lt;/code&gt; is repeated on every row.&lt;/p&gt;
&lt;div class=&#34;highlight-wrapper&#34;&gt;
    
    
        &lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;llm_df&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;tibble&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;n&#34;&gt;llm_id&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;llm_id&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;n&#34;&gt;p_yes&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;sample&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;accuracy_range&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;n_llms&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;n&#34;&gt;p_no&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;1&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;-&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;p_yes&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;n&#34;&gt;n&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;n_runs&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;llm_df&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; # A tibble: 15 × 4&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;    llm_id p_yes   p_no     n&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;    &amp;lt;chr&amp;gt;  &amp;lt;dbl&amp;gt;  &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  1 LLM_01  0.49 0.51     100&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  2 LLM_02  0.91 0.09     100&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  3 LLM_03  0.46 0.54     100&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  4 LLM_04  0.88 0.12     100&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  5 LLM_05  0.45 0.55     100&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  6 LLM_06  0.52 0.48     100&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  7 LLM_07  0.55 0.45     100&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  8 LLM_08  0.6  0.4      100&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  9 LLM_09  0.65 0.35     100&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; 10 LLM_10  0.61 0.39     100&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; 11 LLM_11  0.93 0.0700   100&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; 12 LLM_12  0.53 0.47     100&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; 13 LLM_13  0.94 0.0600   100&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; 14 LLM_14  0.69 0.31     100&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; 15 LLM_15  0.92 0.0800   100&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
    
&lt;/div&gt;
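&lt;p&gt;The way &lt;code&gt;tibble()&lt;/code&gt; fills in columns can be seen on a smaller scale. This is only a sketch with made-up values; &lt;code&gt;toy_df&lt;/code&gt; and its contents are hypothetical and are not part of the simulation.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(tibble)

# Hypothetical three-row parameter table
toy_df &amp;lt;- tibble(
  id = c(&amp;#34;a&amp;#34;, &amp;#34;b&amp;#34;, &amp;#34;c&amp;#34;),
  p_yes = c(0.2, 0.5, 0.9),
  p_no = 1 - p_yes, # computed element by element from the column just above
  n = 100           # a length-one value is recycled to every row
)

toy_df$p_no
#&amp;gt; [1] 0.8 0.5 0.1

toy_df$n
#&amp;gt; [1] 100 100 100
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Columns are created in order, which is why &lt;code&gt;p_no&lt;/code&gt; can refer to &lt;code&gt;p_yes&lt;/code&gt; inside the same call.&lt;/p&gt;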

&lt;p&gt;Next we need a function that can accept each row as a series of arguments and use it to generate a vector of LLM answers:&lt;/p&gt;
&lt;div class=&#34;highlight-wrapper&#34;&gt;
    
    
        &lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;run_llm&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;kr&#34;&gt;function&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;llm_id&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;p_yes&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;p_no&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;n_runs&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;tibble&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;p&#34;&gt;{{&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;llm_id&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;}}&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;:=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;sample&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;nf&#34;&gt;c&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;N&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Y&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;n_runs&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;replace&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;kc&#34;&gt;TRUE&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;prob&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;c&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;p_yes&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;p_no&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
    
&lt;/div&gt;

&lt;p&gt;This function returns a data frame that has one column and a hundred rows of Y/N answers sampled at a given probability of yes and no answers. There are two tricks. The first is that we want the name of the column to be the same as the &lt;code&gt;llm_id&lt;/code&gt;. To do this we have to &lt;a href=&#34;https://en.wikipedia.org/wiki/Quasi-quotation&#34;&gt;quasi-quote&lt;/a&gt; the &lt;code&gt;llm_id&lt;/code&gt; argument. This is what the &lt;code&gt;{{  }}&lt;/code&gt; around &lt;code&gt;llm_id&lt;/code&gt; &lt;a href=&#34;https://rlang.r-lib.org/reference/topic-data-mask.html&#34;&gt;does inside the function&lt;/a&gt;. It lets us use the value of &lt;code&gt;llm_id&lt;/code&gt; as a symbol that&amp;rsquo;ll name the column. Normally when using &lt;code&gt;tibble()&lt;/code&gt; to make a data frame we create a column with &lt;code&gt;col_name = vector_of_values&lt;/code&gt;. We did that when we made &lt;code&gt;llm_df&lt;/code&gt; a minute ago. But because we&amp;rsquo;re quasi-quoting the LLM name on the &lt;em&gt;left&lt;/em&gt; side of a naming operation, we can&amp;rsquo;t use &lt;code&gt;=&lt;/code&gt; as normal: R&amp;rsquo;s syntax only allows a plain name to the left of &lt;code&gt;=&lt;/code&gt; inside a function call. So the second trick is to assign the name&amp;rsquo;s contents using the excellently-named &lt;a href=&#34;https://www.tidyverse.org/blog/2020/02/glue-strings-and-tidy-eval/&#34;&gt;walrus operator&lt;/a&gt;, &lt;code&gt;:=&lt;/code&gt;. If we were quasi-quoting with &lt;code&gt;{{ }}&lt;/code&gt; on the right-hand side, an &lt;code&gt;=&lt;/code&gt; would be fine.&lt;/p&gt;
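&lt;p&gt;The naming trick is easier to see in isolation. The sketch below uses the glue-style string form of the same idea, &lt;code&gt;&amp;#34;{col_name}&amp;#34; :=&lt;/code&gt;, rather than the &lt;code&gt;{{ }}&lt;/code&gt; form used in &lt;code&gt;run_llm()&lt;/code&gt;. The helper &lt;code&gt;make_named_col()&lt;/code&gt; and the object &lt;code&gt;toy_col&lt;/code&gt; are hypothetical names made up for this example, and it assumes a reasonably recent version of tibble and rlang.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(tibble)

# Hypothetical helper: build a one-column tibble whose column name
# is taken from the string stored in col_name
make_named_col &amp;lt;- function(col_name, values) {
  tibble(&amp;#34;{col_name}&amp;#34; := values)
}

toy_col &amp;lt;- make_named_col(&amp;#34;LLM_01&amp;#34;, c(&amp;#34;N&amp;#34;, &amp;#34;Y&amp;#34;, &amp;#34;N&amp;#34;))

# The column is named after the value we passed in, not after the argument
names(toy_col)
#&amp;gt; [1] &amp;#34;LLM_01&amp;#34;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Writing &lt;code&gt;tibble(col_name = values)&lt;/code&gt; inside the helper would instead produce a column literally called &lt;code&gt;col_name&lt;/code&gt;, which is the problem the walrus operator solves.&lt;/p&gt;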
&lt;p&gt;Now we&amp;rsquo;re ready to go. We feed the &lt;code&gt;llm_df&lt;/code&gt; table a row at a time to the &lt;code&gt;run_llm&lt;/code&gt; function by using one of purrr&amp;rsquo;s &lt;a href=&#34;https://purrr.tidyverse.org&#34;&gt;map functions&lt;/a&gt;. Specifically, we use &lt;a href=&#34;https://purrr.tidyverse.org/reference/pmap.html&#34;&gt;pmap&lt;/a&gt;, which takes a list of multiple function arguments and hands them to a function. We have set up &lt;code&gt;llm_df&lt;/code&gt; so that its columns are named and ordered the way that our &lt;code&gt;run_llm()&lt;/code&gt; function expects (the &lt;code&gt;n&lt;/code&gt; column is picked up by the &lt;code&gt;n_runs&lt;/code&gt; argument through R&amp;rsquo;s partial matching of argument names), so it&amp;rsquo;s nice and compact.&lt;/p&gt;
&lt;div class=&#34;highlight-wrapper&#34;&gt;
    
    
        &lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;llm_outputs&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;pmap&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;as.list&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;llm_df&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;n&#34;&gt;run_llm&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;bind_cols&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;llm_outputs&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; # A tibble: 100 × 15&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;    LLM_01 LLM_02 LLM_03 LLM_04 LLM_05 LLM_06 LLM_07 LLM_08 LLM_09 LLM_10 LLM_11 LLM_12 LLM_13 LLM_14 LLM_15&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;    &amp;lt;chr&amp;gt;  &amp;lt;chr&amp;gt;  &amp;lt;chr&amp;gt;  &amp;lt;chr&amp;gt;  &amp;lt;chr&amp;gt;  &amp;lt;chr&amp;gt;  &amp;lt;chr&amp;gt;  &amp;lt;chr&amp;gt;  &amp;lt;chr&amp;gt;  &amp;lt;chr&amp;gt;  &amp;lt;chr&amp;gt;  &amp;lt;chr&amp;gt;  &amp;lt;chr&amp;gt;  &amp;lt;chr&amp;gt;  &amp;lt;chr&amp;gt; &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  1 N      N      N      N      Y      N      Y      Y      N      N      N      N      N      N      N     &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  2 N      N      N      N      N      N      N      N      N      N      N      Y      N      N      Y     &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  3 N      N      N      N      N      N      N      N      N      N      N      N      Y      N      Y     &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  4 N      Y      N      N      N      N      N      N      Y      N      N      N      N      N      Y     &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  5 N      N      N      Y      Y      N      Y      N      Y      Y      N      N      N      N      Y     &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  6 N      N      N      Y      Y      N      N      Y      N      N      N      N      Y      Y      N     &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  7 N      Y      N      N      Y      Y      Y      Y      Y      N      N      N      N      Y      N     &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  8 N      N      N      N      N      N      N      Y      Y      N      N      N      N      N      Y     &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  9 N      N      N      N      N      N      Y      Y      Y      N      N      N      N      N      N     &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; 10 N      N      Y      N      Y      Y      N      Y      Y      N      N      N      N      N      Y     &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; # ℹ 90 more rows&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
    
&lt;/div&gt;

&lt;p&gt;We write &lt;code&gt;as.list(llm_df)&lt;/code&gt; because &lt;code&gt;pmap()&lt;/code&gt; wants its series of arguments as a list. (A data frame is just a list where each list element&amp;mdash;each column&amp;mdash;is the same length, by the way.) It returns a list of fifteen LLM runs, which we then bind by column back into a data frame. Nice.&lt;/p&gt;
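&lt;p&gt;Here is a small sketch of the same row-at-a-time pattern on a made-up table. The names &lt;code&gt;toy_params&lt;/code&gt;, &lt;code&gt;base&lt;/code&gt;, &lt;code&gt;power&lt;/code&gt;, and &lt;code&gt;toy_powers&lt;/code&gt; are hypothetical. It uses &lt;code&gt;pmap_dbl()&lt;/code&gt;, a variant of &lt;code&gt;pmap()&lt;/code&gt; that returns a numeric vector instead of a list, purely to keep the output short.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(purrr)
library(tibble)

# Hypothetical parameter table: each row holds the arguments for one call
toy_params &amp;lt;- tibble(
  base = c(1, 2, 3),
  power = c(2, 2, 3)
)

# A data frame is a list of equal-length columns
is.list(toy_params)
#&amp;gt; [1] TRUE

# The function is called once per row; columns are matched to arguments by name
toy_powers &amp;lt;- pmap_dbl(toy_params, function(base, power) base^power)
toy_powers
#&amp;gt; [1]  1  4 27
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;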
&lt;p&gt;Now we can evaluate all these LLMs against our &lt;code&gt;human_evals&lt;/code&gt; vector in the same way, by mapping or applying the &lt;code&gt;eval_llm()&lt;/code&gt; function we wrote earlier. This time we can just use regular &lt;code&gt;map()&lt;/code&gt; because there&amp;rsquo;s only one varying argument, the column of answers from each LLM. We take the &lt;code&gt;llm_outputs&lt;/code&gt; data frame and use an anonymous or lambda function, &lt;code&gt;\(x)&lt;/code&gt;, to say &amp;ldquo;pass each column to &lt;code&gt;eval_llm()&lt;/code&gt; along with the non-varying &lt;code&gt;human_evals&lt;/code&gt; vector&amp;rdquo;. (You could write this without a lambda, too, but I find this syntax more consistent.) At the end there I deliberately convert all these character vectors to &lt;a href=&#34;https://r4ds.had.co.nz/factors.html&#34;&gt;factors&lt;/a&gt; for a reason I&amp;rsquo;ll get to momentarily.&lt;/p&gt;
&lt;div class=&#34;highlight-wrapper&#34;&gt;
    
    
        &lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt; 1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 5
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 6
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 7
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 8
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 9
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;10
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;11
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;12
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;13
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;14
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;15
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;16
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;17
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;18
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;19
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;20
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;llm_results&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;map&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;llm_outputs&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;\&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;x&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;eval_llm&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;x&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;human_evals&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;))&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;bind_cols&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;()&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;mutate&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;nf&#34;&gt;across&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;nf&#34;&gt;everything&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(),&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;as.factor&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;llm_results&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; # A tibble: 100 × 15&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;    LLM_01         LLM_02         LLM_03         LLM_04         LLM_05  LLM_06 LLM_07 LLM_08 LLM_09 LLM_10 LLM_11 LLM_12 LLM_13 LLM_14 LLM_15&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;    &amp;lt;fct&amp;gt;          &amp;lt;fct&amp;gt;          &amp;lt;fct&amp;gt;          &amp;lt;fct&amp;gt;          &amp;lt;fct&amp;gt;   &amp;lt;fct&amp;gt;  &amp;lt;fct&amp;gt;  &amp;lt;fct&amp;gt;  &amp;lt;fct&amp;gt;  &amp;lt;fct&amp;gt;  &amp;lt;fct&amp;gt;  &amp;lt;fct&amp;gt;  &amp;lt;fct&amp;gt;  &amp;lt;fct&amp;gt;  &amp;lt;fct&amp;gt; &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  1 False Positive False Positive False Positive True Negative  True N… True … False… True … True … True … True … True … False… True … True …&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  2 True Positive  False Negative False Negative False Negative False … False… False… True … False… False… False… False… False… True … False…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  3 False Negative True Positive  False Negative False Negative False … False… True … False… False… False… False… False… False… True … False…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  4 False Positive False Positive True Negative  True Negative  True N… True … True … True … True … True … True … True … True … False… True …&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  5 True Negative  True Negative  True Negative  True Negative  False … False… False… True … False… True … True … True … True … False… False…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  6 False Positive True Negative  True Negative  False Positive False … False… True … False… False… True … True … True … False… True … True …&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  7 False Positive False Positive True Negative  False Positive True N… True … False… False… False… True … True … True … True … True … True …&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  8 False Negative True Positive  False Negative False Negative True P… False… False… False… False… False… False… False… False… True … True …&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  9 False Negative True Positive  False Negative True Positive  True P… False… True … False… True … False… False… False… False… False… False…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; 10 True Negative  True Negative  False Positive True Negative  True N… True … False… False… False… True … True … True … False… False… False…&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; # ℹ 90 more rows&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
    
&lt;/div&gt;

&lt;p&gt;Now we&amp;rsquo;re done; each LLM has been compared to the ground truth and we can construct a confusion matrix of counts for each column if we want. Let&amp;rsquo;s summarize the table, adding a cost code for the bad answers:&lt;/p&gt;
&lt;div class=&#34;highlight-wrapper&#34;&gt;
    
    
        &lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt; 1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 5
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 6
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 7
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 8
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 9
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;10
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;11
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;12
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;13
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;14
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;15
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;16
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;17
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;18
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;19
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;20
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;21
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;22
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;23
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;24
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;25
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;26
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;27
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;llm_summary&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;llm_results&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;pivot_longer&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;nf&#34;&gt;everything&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;())&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;group_by&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;name&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;value&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;.drop&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;kc&#34;&gt;FALSE&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;tally&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;()&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;mutate&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;prop&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;n&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;/&lt;/span&gt;&lt;span class=&#34;nf&#34;&gt;sum&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;n&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;         &lt;span class=&#34;n&#34;&gt;cost&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;case_when&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;           &lt;span class=&#34;n&#34;&gt;value&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;%in%&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;c&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;True Positive&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;True Negative&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;~&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;1L&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;           &lt;span class=&#34;n&#34;&gt;value&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;==&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;False Negative&amp;#34;&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;~&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;2L&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;           &lt;span class=&#34;n&#34;&gt;value&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;==&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;False Positive&amp;#34;&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;~&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;4L&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;llm_summary&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; # A tibble: 60 × 5&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; # Groups:   name [15]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;    name   value              n  prop  cost&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;    &amp;lt;chr&amp;gt;  &amp;lt;fct&amp;gt;          &amp;lt;int&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;int&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  1 LLM_01 False Negative    18  0.18     2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  2 LLM_01 False Positive    19  0.19     4&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  3 LLM_01 True Negative     59  0.59     1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  4 LLM_01 True Positive      4  0.04     1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  5 LLM_02 False Negative    16  0.16     2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  6 LLM_02 False Positive    22  0.22     4&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  7 LLM_02 True Negative     56  0.56     1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  8 LLM_02 True Positive      6  0.06     1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt;  9 LLM_03 False Negative    19  0.19     2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; 10 LLM_03 False Positive    17  0.17     4&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;c1&#34;&gt;#&amp;gt; # ℹ 50 more rows&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
    
&lt;/div&gt;

&lt;p&gt;Now, why did I convert the LLM results table from characters to factors? It&amp;rsquo;s because of how &lt;code&gt;dplyr&lt;/code&gt; handles table summaries. It&amp;rsquo;s possible that an LLM could get e.g. all True Positive answers, leaving the other three cells in its confusion matrix empty, i.e. with zero counts in those rows. By default, when tallying counts of character vectors, &lt;code&gt;dplyr&lt;/code&gt; drops empty groups. For some kinds of tallying that&amp;rsquo;s fine, but for others you definitely want to keep a tally of zero-count cells. With factors we can tell &lt;code&gt;dplyr&lt;/code&gt; explicitly not to drop them. (You can also set this option permanently for a given analysis.) The alternative is to &lt;a href=&#34;https://kieranhealy.org/blog/archives/2018/11/19/zero-counts-in-dplyr/&#34;&gt;ungroup and complete&lt;/a&gt; the table once it&amp;rsquo;s been created, explicitly adding back in the implicitly missing zero-count rows.&lt;/p&gt;
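&lt;p&gt;Here is a small sketch of that behavior, assuming &lt;code&gt;dplyr&lt;/code&gt; is loaded. The model name &lt;code&gt;LLM_99&lt;/code&gt; and its three answers are invented for illustration; the factor levels are the four outcome labels used above.&lt;/p&gt;

```r
library(dplyr)

# Hypothetical toy results: an imaginary model that never gives a wrong answer.
# The factor still declares all four possible outcomes as levels.
tibble(
  name = "LLM_99",
  value = factor(
    c("True Positive", "True Positive", "True Negative"),
    levels = c("False Negative", "False Positive", "True Negative", "True Positive")
  )
) |>
  # .drop = FALSE keeps factor levels that have no observations
  count(name, value, .drop = FALSE)
```

&lt;p&gt;The two False rows come back with a count of zero rather than disappearing. With a plain character column there are no declared levels to consult, so the unobserved outcomes could not be recovered this way.&lt;/p&gt;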
&lt;p&gt;Now that we have our table, we can graph it. As I said at the beginning, stacked bar charts are not great in many cases, but one is fine here, and better than trying to draw fifteen separate confusion matrices. We don&amp;rsquo;t really care about the difference between true positives and true negatives anyway. We take the results table, merge it with the summary table, and draw our graph, ordering the LLMs by the average cost of their answers. We use a manual four-value color palette to distinguish the broadly bad from the broadly good answers.&lt;/p&gt;
&lt;div class=&#34;highlight-wrapper&#34;&gt;
    
    
        &lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt; 1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 5
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 6
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 7
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 8
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 9
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;10
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;11
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;12
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;13
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;14
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;15
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;16
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;17
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;18
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;n&#34;&gt;llm_results&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;pivot_longer&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;nf&#34;&gt;everything&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;())&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;left_join&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;llm_summary&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;mutate&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;name&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;str_replace&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;name&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;_&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34; &amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;))&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;ggplot&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;nf&#34;&gt;aes&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;y&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;reorder&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;name&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;cost&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;fill&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;value&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;))&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;geom_bar&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;color&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;white&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;linewidth&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;0.25&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;scale_fill_manual&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;values&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;fourval_pal&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;guides&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;fill&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;guide_legend&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;reverse&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;kc&#34;&gt;TRUE&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;                             &lt;span class=&#34;n&#34;&gt;title.position&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;top&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;                             &lt;span class=&#34;n&#34;&gt;label.position&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;bottom&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;                             &lt;span class=&#34;n&#34;&gt;keywidth&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;3&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;                             &lt;span class=&#34;n&#34;&gt;nrow&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;1&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;))&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;labs&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;x&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;N Questions&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;y&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;kc&#34;&gt;NULL&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;       &lt;span class=&#34;n&#34;&gt;fill&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;Compared with a Person the LLM Yielded ...&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;       &lt;span class=&#34;n&#34;&gt;title&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;LLM Performance Relative to Human Baseline&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;       &lt;span class=&#34;n&#34;&gt;subtitle&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;False Positives are twice as costly as False Negatives&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;theme_minimal&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;()&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nf&#34;&gt;theme&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;legend.position&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;s&#34;&gt;&amp;#34;top&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
    
&lt;/div&gt;
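
&lt;p&gt;The ordering in that plot comes from &lt;code&gt;reorder(name, cost)&lt;/code&gt;, which by default sorts the levels by the mean of the second argument within each group. A minimal base R sketch, using two made-up models and made-up per-answer costs:&lt;/p&gt;

```r
# Hypothetical toy data: "LLM A" has two costly answers, "LLM B" has none.
# reorder() summarizes the costs within each name using mean() by default
# and returns a factor whose levels are sorted by that summary.
levels(reorder(
  c("LLM A", "LLM A", "LLM A", "LLM B", "LLM B", "LLM B"),
  c(4, 2, 1, 1, 1, 1)
))
```

&lt;p&gt;Here the levels come back with the lower mean cost first, so &lt;code&gt;LLM B&lt;/code&gt; precedes &lt;code&gt;LLM A&lt;/code&gt;.&lt;/p&gt;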

&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/10/03/iterating-some-sample-data/llm-example.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2025/10/03/iterating-some-sample-data/llm-example.png&#34;
         alt=&#34;A stacked bar chart&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;Ordered and stacked bar chart of imaginary LLM performance.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
</description>
    </item>
    
    
    
    <item>
      <title>The Road to Selfdom</title>
      <link>https://kieranhealy.org/blog/archives/2025/08/21/the-road-to-selfdom/</link>
      <pubDate>Thu, 21 Aug 2025 07:38:35 -0400</pubDate>
      
      <guid>https://kieranhealy.org/blog/archives/2025/08/21/the-road-to-selfdom/</guid>
      <description>&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/08/21/the-road-to-selfdom/ramac.jpeg&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2025/08/21/the-road-to-selfdom/ramac.jpeg&#34;
         alt=&#34;RAMAC 305 brochure cover&#34;/&gt;&lt;/a&gt;
&lt;/figure&gt;
&lt;p&gt;Marion and I have &lt;a href=&#34;https://aeon.co/essays/the-sovereign-individual-and-the-paradox-of-the-digital-age&#34;&gt;an essay in Aeon&lt;/a&gt; today:&lt;/p&gt;



&lt;blockquote&gt;
    &lt;p&gt;What is happening here is more than an abstract flow of information. It is more than a means of surveillance. It is more than a price mechanism. Rather, it’s as if the air traffic control and insurance commission functions of the IBM 650 have been fused, shrunk, and wholly generalised. This is the real computing revolution. Much of what we do is immediately authenticated as we do it, stored as data, classified or scored on some sort of scale, and deployed in real time to modulate some outcome of interest – usually, the behaviour of a person, or a machine, or an organisation. &amp;hellip; Because of this transformation, our sense of who we are is assembled in a strange and tangled fashion. The machinery of ordinalisation attends carefully to individuals rather than coarse classes or groups. By doing so, it appears to liberate people from the constraints of social affiliations and to judge them for their distinctive qualities and contributions. It promises incorporation for the excluded, recognition for the creative, and just rewards for the entrepreneurial. And yet this emancipatory promise is delivered through systems that classify, sort and, above all, rank people with ever-greater precision and on a previously unimaginable scale. The resulting social order is a sort of paradox, characterised by constant tensions between personal freedom and social control, between the subjective elan of inner authenticity and the objective forces of external authentication. It gives rise to a certain way of being, a new kind of self, whose experiences are defined by the push for personal autonomy and the pull of platform dependency.&lt;/p&gt;

&lt;/blockquote&gt;

&lt;p&gt;You should &lt;a href=&#34;https://aeon.co/essays/the-sovereign-individual-and-the-paradox-of-the-digital-age&#34;&gt;go check it out&lt;/a&gt;.&lt;/p&gt;
</description>
    </item>
    
    
    
    <item>
      <title>Blueberry Hill</title>
      <link>https://kieranhealy.org/blog/archives/2025/08/07/blueberry-hill/</link>
      <pubDate>Thu, 07 Aug 2025 20:07:52 -0400</pubDate>
      
      <guid>https://kieranhealy.org/blog/archives/2025/08/07/blueberry-hill/</guid>
      <description>&lt;p&gt;&lt;a href=&#34;https://www.bbc.com/news/articles/cy5prvgw0r1o&#34;&gt;ChatGPT 5 was released today&lt;/a&gt;.&lt;/p&gt;



&lt;blockquote&gt;
    &lt;p&gt;ChatGPT-maker OpenAI has unveiled the long-awaited latest version of its artificial intelligence (AI) chatbot, GPT-5, saying it can provide PhD-level expertise. Billed as &amp;ldquo;smarter, faster, and more useful,&amp;rdquo; OpenAI co-founder and chief executive Sam Altman lauded the company&amp;rsquo;s new model as ushering in a new era of ChatGPT. &amp;ldquo;I think having something like GPT-5 would be pretty much unimaginable at any previous time in human history,&amp;rdquo; he said ahead of Thursday&amp;rsquo;s launch. GPT-5&amp;rsquo;s release and claims of its &amp;ldquo;PhD-level&amp;rdquo; abilities in areas such as coding and writing come as tech firms continue to compete to have the most advanced AI chatbot.&lt;/p&gt;

&lt;/blockquote&gt;

&lt;p&gt;I had a chat with it. Yes, I know the questions surrounding AI are tricky. But I do not pretend to address those here; I merely report the following conversation.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;em&gt;The thesis outlined.&lt;/em&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/08/07/blueberry-hill/blueberry-1.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2025/08/07/blueberry-hill/blueberry-1.png&#34;/&gt;&lt;/a&gt;
&lt;/figure&gt;
&lt;ol start=&#34;2&#34;&gt;
&lt;li&gt;&lt;em&gt;The thesis patiently explained, defended in the face of objections, and further elaborated.&lt;/em&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/08/07/blueberry-hill/blueberry-2.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2025/08/07/blueberry-hill/blueberry-2.png&#34;/&gt;&lt;/a&gt;
&lt;/figure&gt;
&lt;ol start=&#34;3&#34;&gt;
&lt;li&gt;&lt;em&gt;The collegial chat and a final effort at rebuttal.&lt;/em&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/08/07/blueberry-hill/blueberry-3.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2025/08/07/blueberry-hill/blueberry-3.png&#34;/&gt;&lt;/a&gt;
&lt;/figure&gt;
&lt;p&gt;In fairness to GPT5, in my career I have indeed encountered PhDs with this level of commitment to their particular blueberry. And many have also had that blithe confidence &amp;mdash; the use of &amp;ldquo;Ah&amp;rdquo;, the &amp;ldquo;Let&amp;rsquo;s slow it down&amp;rdquo; (to your two-B level), the &amp;ldquo;Exactly&amp;rdquo; (Now you see my genius), the confidently colloquial &amp;ldquo;Yep&amp;rdquo; and &amp;ldquo;Nope&amp;rdquo; &amp;hellip; actually I retract my earlier skepticism; the lad has the makings of a fine philosopher.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Very quick update:&lt;/em&gt; Posting this on social media led to some interesting responses. I’m afraid I can’t say I am especially impressed by the helpful, and certainly well-meant, suggestions that I spend time doing a little prompt engineering in order to get GPT5 to emit the right answer to the question “How many times does the letter b appear in blueberry”. Meanwhile, my excellent student &lt;a href=&#34;https://acastroaraujo.github.io/blog/&#34;&gt;Andrés&lt;/a&gt; got a different, and correct, response from his attempt. So score one for replicability, I guess.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Update 2:&lt;/em&gt; Look, these tools are amazing in lots of ways. But if you’re gonna pitch and sell&amp;mdash;or foist&amp;mdash;them on the world at large, or your employees, by saying “This is your PhD-level genius expert buddy you can ask things in natural language”, don’t get all pissy when it comically bombs tasks people reasonably think of as trivial.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Update 3, now in meme form&lt;/em&gt;:&lt;/p&gt;
&lt;figure&gt;&lt;a href=&#34;https://kieranhealy.org/blog/archives/2025/08/07/blueberry-hill/blueberry-4.png&#34; data-fancybox&gt;
    &lt;img src=&#34;https://kieranhealy.org/blog/archives/2025/08/07/blueberry-hill/blueberry-4.png&#34;
         alt=&#34;I don&amp;rsquo;t think you get to have it both ways. That is, you don’t get to, as it were, borrow charisma from all the hype and then disavow every failure to live up to it as someone else’s naive mistake for believing the hype.&#34;/&gt;&lt;/a&gt;&lt;figcaption&gt;
            &lt;p&gt;I don&amp;rsquo;t think you get to have it both ways. That is, you don’t get to, as it were, borrow charisma from all the hype and then disavow every failure to live up to it as someone else’s naive mistake for believing the hype.&lt;/p&gt;
        &lt;/figcaption&gt;
&lt;/figure&gt;
</description>
    </item>
    
    
    
    <item>
      <title>The Sound of Silence</title>
      <link>https://kieranhealy.org/blog/archives/2025/07/22/the-sound-of-silence/</link>
      <pubDate>Tue, 22 Jul 2025 07:36:20 -0400</pubDate>
      
      <guid>https://kieranhealy.org/blog/archives/2025/07/22/the-sound-of-silence/</guid>
      <description>&lt;p&gt;There is a &lt;a href=&#34;https://github.com/openai/whisper/discussions/2608&#34;&gt;GitHub discussion&lt;/a&gt; about OpenAI&amp;rsquo;s &lt;a href=&#34;https://github.com/openai/whisper&#34;&gt;Whisper&lt;/a&gt;, a good speech-recognition and transcription model with support for a large number of languages. A lot of people use it. The issue:&lt;/p&gt;



&lt;blockquote&gt;
    &lt;p&gt;Complete silence is always hallucinated as &amp;ldquo;ترجمة نانسي قنقر&amp;rdquo; in Arabic which translates as &amp;ldquo;Translation by Nancy Qunqar&amp;rdquo;&lt;/p&gt;

&lt;/blockquote&gt;

&lt;p&gt;In the comments, people note that this class of error has been known for a while and there are equivalents or counterparts in other languages:&lt;/p&gt;



&lt;blockquote&gt;
    &lt;p&gt;I found a similar thing happens in German where it says &amp;ldquo;Untertitelung des ZDF für funk, 2017.&amp;rdquo;&lt;/p&gt;

&lt;/blockquote&gt;

&lt;p&gt;That would be &amp;ldquo;Subtitling by ZDF for funk, 2017&amp;rdquo;, although another commenter notes &amp;ldquo;In german it&amp;rsquo;s &amp;lsquo;Vielen Dank&amp;rsquo; (Thank you very much)&amp;rdquo;. In Romanian,&lt;/p&gt;



&lt;blockquote&gt;
    &lt;p&gt;i’ve noticed multiple instances where the transcripts ends with “nu uitati sa da-ti like si subscribe” which, as you might easily infer , translates to “don’t forget to like and subscribe”.&lt;/p&gt;

&lt;/blockquote&gt;

&lt;p&gt;You can see what&amp;rsquo;s happening. The model has learned to associate silence with the end of a recording. As &lt;a href=&#34;https://github.com/openai/whisper/discussions/2608#discussioncomment-13842561&#34;&gt;KillerX says&lt;/a&gt;,&lt;/p&gt;



&lt;blockquote&gt;
    &lt;p&gt;this seems to be an artifact of the fact that Whisper was trained on (amongst other things) YouTube audio + available subtitles. Often subtitlers add their copyright notice onto the end of the subtitles, and the end of the videos are often credits with music, applause, or silence. Thus whisper learned that silence == &amp;lsquo;copyright notice&amp;rsquo;.&lt;/p&gt;

&lt;/blockquote&gt;

&lt;p&gt;Amusingly,&lt;/p&gt;



&lt;blockquote&gt;
    &lt;p&gt;In English there is always applause&lt;/p&gt;

&lt;/blockquote&gt;

</description>
    </item>
    
    
  </channel>
</rss>
