November 10, 2025

Trustworthy Data Visualization

This past September I gave the closing keynote at posit::conf; it’s now available to watch on YouTube. Keen-eyed observers will note from the title that it’s about trustworthy data visualization. But it’s also about trust a bit more generally, and how we should think about it in a world where researchers are faking results, AIs are enthusiastically confabulating, and government is destroying data infrastructure. When you find yourself giving a talk with a little tiny microphone stuck to the side of your head you have to ask yourself some hard questions, but the talk was partly about that. ❧ Continue reading…

November 6, 2025

Mamdani vs Sliwa and Cuomo

Mamdani’s victory in the New York City mayoral election gave me the opportunity to draw a few maps, and also to learn a bit about incorporating additional spatial data into maps drawn in R. R is not a specialized piece of GIS software. ESRI’s ArcGIS is the 800lb gorilla in this world and QGIS the GIMP to its Photoshop, so to speak.

Still, you can do a lot of spatial stuff in R, grounded in the sf package and its many friends. Plus you get the benefit of all the data manipulation and analysis that R is really good at. So, having gotten the precinct-level results for the election, some maps from New York City (e.g., the clipped borough boundaries map), and GTFS data from the MTA describing the structure of the subway system, I was able to draw some things. I strongly approve of the existence of the GTFS, by the way. It’s a spec for encoding transit data and lots of cities use it. Really handy. ❧ Continue reading…

October 28, 2025

GSS Release

[Figure: Trends in the immameco question.]

Release 2 of the 2024 GSS cross-section and 1972-2024 cumulative data are now available. I’ve updated gssr and gssrdoc to incorporate them. There are quite a few changes in the data and variables, thanks in part to some changes in data collection methods and a privacy/disclosure review.

The gssr and gssrdoc packages are the nicest way to get General Social Survey data up and running in R. The figure above shows (survey-weighted) trends derived from the immameco question. ❧ Continue reading…

October 25, 2025

Manhattan Plot of Manhattan

[Figure: Skyline plot.]

Here I continue my efforts to design visualizations that are as poorly-suited as possible to being displayed on phones. It looks pretty good on a big monitor, or six feet wide on a wall.

I made a version of this plot a few years ago. I ended up revisiting it this morning because I’m updating various datasets and code. “Manhattan plot” is a term sometimes used to describe a kind of scatter plot where the x-values are fairly continuous, and the y-values have distributions with long tails, so the plot looks like a skyline. This one here is a bar chart rather than a scatter plot but it’s still a kind of Manhattan plot of Manhattan. ❧ Continue reading…

October 19, 2025

gssrdoc Updates

Regular readers know that I maintain gssr and gssrdoc, two packages for R. The former makes the General Social Survey’s annual, cumulative and panel datasets available in a way that’s easy to use in R. The latter makes the survey’s codebook available in R’s integrated help system in a way that documents every GSS variable as if it were a function or object in R, so you can query them in exactly the same way as any function from the R console or in the IDE of your choice. As a bonus, because I use pkgdown to document the packages, I get a website as a side effect. In the case of gssrdoc this means a browsable index of all the GSS variables. The GSS is the Hubble Space Telescope of American social science; our longest-running representative view of many aspects of the character and opinions of American households. The data is freely available from NORC, but they distribute it in SPSS, SAS, and Stata formats. I wrote these packages in an effort to make it more easily available in R. If you want to know the relationship between these various platforms, I have you covered. But the important thing is that R is a free and open-source project, and the others are not. ❧ Continue reading…

All Posts

2025

November 10   Trustworthy Data Visualization · November 6   Mamdani vs Sliwa and Cuomo · October 28   GSS Release · October 25   Manhattan Plot of Manhattan · October 19   gssrdoc Updates · October 13   Parking Signs · October 8   Halloween in the Round · October 3   Iterating some sample data · August 21   The Road to Selfdom · August 7   Blueberry Hill · July 22   The Sound of Silence · July 9   Embeddable Mac · June 28   American · June 26   Razor, Gun, Fence · June 18   Oh Leave it Out · June 9   LA County Population · February 20   TSA Screening Volume and Epiweeks · February 19   MTA Ridership · February 16   Burn Notice · February 6   Kerning and Kerning in a Widening Gyre

2024

October 12   Halloween Data Cleaning · September 6   Dr Drang and the Electoral College · July 17   Apple's First Post-Taboola Event · June 1   A New York City Adults and Children Dotmap · May 31   A New York City Race and Ethnicity Dotmap · May 30   A Population Dotmap of New York City · May 29   Race and Ethnicity in New York City · May 20   Harrison White 1930–2024 · May 16   New York City's POC Population · May 8   Inspirational Quotes · April 16   Six to Ten Hours of Poly-Processing · April 15   gssr is now two packages: gssr and gssrdoc · April 12   Daily Average Sea Surface Temperature Animation · April 9   The Eclipse via Satellite · April 4   Make Your Own NOAA Sea Temperature Graph · April 1   gssr Update · March 28   Book Day · March 14   Pi Day Circles · March 3   A PCoA of New York City Neighborhoods and Street Tree Species · February 29   New York City's Street Tree Species · February 29   Street Tree Diameters and Income in New York City Neighborhoods

2023

December 21   The Ordinal Society Site · December 20   The Baby Boom Again · December 6   Dorling Cartograms · December 2   gssr Update · August 10   Flipbookr for Quarto · June 19   The Naming of Stats · May 10   Free Speech Tsar · March 30   Assault Deaths in the OECD 1960-2020 · March 29   Life Expectancy and Health Spending in the OECD · March 25   Reading Remote Data Files · January 8   Escaping the Malthusian Trap

2022

July 22   Unhappy in its Own Way · June 29   Skyline Timeline · June 24   New York Building Ages · June 23   Manhattan Building Heights · May 20   Every Springer Math Text · May 11   Academia Explained · April 27   Map and Nested Lists · April 10   Indexing Iterations with set_names() · April 8   Iterating on the GSS · February 15   Clustering Pundits · February 14   Desktop Mac

2021

December 19   Comparing Distributions · October 30   The Polarization of Death · October 21   Excess Deaths in 2020 · October 9   Building a PDP-11/70 Kit · September 3   Covid Trajectories · May 4   Map, Walk, Pivot · May 2   Contributions to the Literature · February 24   Excess Deaths February Update · January 26   Income and Happiness · January 8   What Happened?

2020

December 18   Cross National Death Rates · October 10   Excess Deaths Overview · October 8   Excess Deaths by Jurisdiction · October 6   Excess Deaths by Cause · October 1   Walk the Walk · September 26   National Weekly Death Rates · September 24   US Excess Mortality · September 14   Dataviz Interview · August 25   Some Data Packages · June 3   The Politics of Disorder · May 23   Get Apple's Mobility Data · May 21   The Kitchen Counter Observatory · May 9   Covid Concept Generator · April 28   New Orleans and Normalization · April 23   Apple's COVID Mobility Data · April 16   Upset Plots · April 10   Covdata Package · March 28   This Is Just to Try to Say · March 27   A COVID Small Multiple · March 21   Covid 19 Tracking · March 15   U.S. Census Counts Data · March 14   Animating U.S. Population Distributions · March 7   This Be the Kirsch · March 5   Spanish Flu · February 26   A New Baby Boom Poster · February 18   Dataviz Workshop at RStudio::conf

2019

November 10   Cleaning the Table