Adventures in data cleaning: Did the New York Times undercount risky San Francisco skyscrapers?

It looks like the New York Times may have undercounted the number of risky skyscrapers in downtown San Francisco: the story listed 39, but the report it drew on lists 48. The June 15 story focused on steel moment-frame buildings cited in a USGS report.

I made a quick map using the addresses from the NYT story, then I wanted to make one that included photos of the buildings. This time I went directly to the report. When I noticed that the first address wasn’t listed in the story, it seemed like a good idea to see if there were any more discrepancies.

TL;DR

To check how I got those numbers:

  • The USGS report, starting on page 360, in .PDF
  • The .KML file I made, for more fact-checking and map making (pretty please send links to your maps or put them in the comments – I’m a casual mapper, using new tools and working quickly!)
  • Here are the additional nine addresses from the report that weren’t in the NYT story:
  1. The Mills Building, 221 Montgomery Street
  2. 225 Bush Street
  3. 140 Montgomery Street
  4. 120 Montgomery Street
  5. 45 Fremont Street
  6. 55 2nd Street
  7. 555 Mission Street
  8. 611 Folsom Street
  9. 680 Folsom Street

The clumsy adventure

To start, I downloaded the 454-page .PDF, then extracted five pages with the buildings listed by using the >Print>Pages>Save as .PDF function in Preview for Mac. Then I converted the .PDF to .CSV with Sejda. After that, it was time for Terminal to merge the extracted data from those pages into one file with the command:

cat *.csv >merged.csv
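A hedged aside for anyone repeating this step: the one-liner works on a first pass, but if merged.csv is saved in the same folder, a second run will try to read its own output. Below is a minimal Python sketch of the same merge that skips the output file and drops blank rows. The function name `merge_csv` and the default output name are my own choices for illustration; nothing here is specific to Sejda's output.

```python
import csv
from pathlib import Path


def merge_csv(folder, out_name="merged.csv"):
    """Concatenate every .csv in `folder` into `out_name`, in filename order.

    Returns the number of rows written.
    """
    folder = Path(folder)
    out_path = folder / out_name
    # Sort for a stable page order, and skip the output file so a re-run
    # does not feed merged.csv back into itself.
    sources = sorted(p for p in folder.glob("*.csv") if p.name != out_name)

    rows_written = 0
    with out_path.open("w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for src in sources:
            with src.open(newline="", encoding="utf-8") as f:
                for row in csv.reader(f):
                    # Drop rows that are empty or contain only whitespace.
                    if any(cell.strip() for cell in row):
                        writer.writerow(row)
                        rows_written += 1
    return rows_written
```

Calling `merge_csv(".")` in the folder with the extracted pages would stand in for the command above. Either way, the merge only stacks the files; it does nothing about what is inside them.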

Still too messy to be useful without a lot of tedious cleanup:

So I tried the quickest and dirtiest way I know: copy the table from the .PDF into Word, then from Word (where it’s recognized as a table) copy it into Excel.

There are a couple hundred buildings listed, but the ones cited in the story are steel moment frames. Erected before a 1994 building code outlawed a flawed welding technique, they harbor particular risk in a quake of magnitude seven or higher.

From the USGS report: Steel moment frame listed as “Steel MF,” “Steel moment frame” and “MF.”

From there it was a question of sorting the buildings listed as “Steel MF,” noting that a couple are listed alternatively as “Steel moment frame” and one simply as “MF.” Messy messy messy: also, totally typical. (There were also about 15 more listed as Steel MF in combination with some other reinforcement. Since it would require more reporting to figure out if they’re as risky, these were left out.)
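I did the sorting by hand in Excel, but the same rule can be sketched in a few lines of Python: treat the three spellings as one category and leave out anything that combines a steel moment frame with another system. The column name `structure` and the sample rows are hypothetical, made up for illustration and not copied from the report.

```python
# Labels the report uses for a plain steel moment frame (the three spellings
# mentioned in the text).
PURE_STEEL_MF = {"steel mf", "steel moment frame", "mf"}


def is_pure_steel_mf(label):
    """True only when the whole cell is one of the three plain labels."""
    # Lowercase and collapse stray whitespace left over from the PDF copy.
    return " ".join(label.lower().split()) in PURE_STEEL_MF


def pure_steel_mf_rows(rows, column="structure"):
    """Keep rows whose structure cell is a plain steel moment frame."""
    return [row for row in rows if is_pure_steel_mf(row.get(column, ""))]


# Hypothetical rows for illustration only; not taken from the report.
sample = [
    {"building": "A", "structure": "Steel MF"},
    {"building": "B", "structure": "steel  moment frame "},
    {"building": "C", "structure": "MF"},
    {"building": "D", "structure": "Steel MF, concrete shear wall"},
    {"building": "E", "structure": "Concrete shear wall"},
]
kept = [row["building"] for row in pure_steel_mf_rows(sample)]
```

Matching the whole cell, rather than searching for “MF” anywhere inside it, is what keeps the combined systems (building D here) out of the count.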

Then I checked the addresses against the story, added polygons for the nine new addresses to the previous uMap, downloaded it as a .KML file and started playing around in Google Maps.
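Checking one list against the other is the step where a small script would have saved the most squinting. Here is a minimal sketch, assuming both lists are already plain lists of address strings. The abbreviation table and the function names are my own and cover only the obvious cases; the "St." spelling in the example is an assumed formatting difference, not a quote from the report.

```python
import re

# A deliberately small, hypothetical abbreviation table; extend as needed.
ABBREVIATIONS = {"st": "street", "ave": "avenue"}


def normalize(address):
    """Lowercase, strip punctuation and expand a few common abbreviations."""
    words = re.sub(r"[^\w\s]", "", address.lower()).split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


def missing_from_story(report_addresses, story_addresses):
    """Return report addresses with no match in the story, in report order."""
    seen = {normalize(a) for a in story_addresses}
    return [a for a in report_addresses if normalize(a) not in seen]


# Three addresses from this post: 550 California Street is in the story,
# the other two are among the nine that are not.
report = ["225 Bush Street", "550 California St.", "55 2nd Street"]
story = ["550 California Street"]
missing = missing_from_story(report, story)
```

With these inputs, `missing` holds the Bush Street and 2nd Street addresses. Anything the normalizer doesn't anticipate (“Second” versus “2nd,” a building name in front of the address) still needs a human eye.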

The resulting map is a little disappointing. For starters, the polygons from uMap (which uses OpenStreetMap) don’t jibe that well with Google. As for the images – since the real a-ha if you live or work in San Francisco is how many of these buildings you’re in or around – I always forget how bad these are in the noob version of Google Maps. When you’re editing in the map, they are Polaroid-style pop-ups that resize whatever pic you throw in. The published version looks nothing like that, and the overall effect with these building shots (all vertical) is horrific. Ugh. There’s no way to resize the window from this version of Google Maps – the alternatives are Google Fusion Tables (which wouldn’t solve the problem here since AFAIK it works with points, not polygons) or programming via the Google Maps API.

Why this happened

So how did the New York Times undercount the number of especially shaky high rises? Going on my experience with newsrooms (long) and with data (short but painful) my first guess is that the USGS mistakenly gave the Times an Excel or .CSV file that was different from what ended up in the final report.

The reporter knew there were enough buildings to warrant a story, somewhere around 40. The graphics person had the file and made the map, and those numbers were plugged into the story and fact-checked without going back to the published report.

Or there was some glitch between the formats. Given how annoying it is to get information out of a .PDF and into anything else, that’s easy enough to imagine. Data cleaning is the least interesting, most tedious part of any project. In this case, if I’m right, there are about 23 percent more risky buildings than originally reported (48 rather than 39).

A quickie map of San Francisco’s earthquake prone skyscrapers


See full screen – search for San Francisco if you see a world map.

The New York Times recently ran a story about San Francisco high rises – mostly downtown and South of Market – with steel frames that harbor particular risk in a quake of magnitude seven or higher. About 40 of these skyscrapers, erected before a 1994 building code outlawed a flawed welding technique, were cited in an April USGS report.

It’s one of those stories that could’ve used an interactive map at its core, but instead (it’s the news business, kid!) the map was a small, static graphic (see below) and the story ended with a list of the addresses.

Image courtesy NYT.

So here’s a simple map of those 39 steel moment-frame buildings. A few necessary caveats: this is the handiwork of a casual mapper trying out a new tool. I’ve been looking for a way to use OpenStreetMap to make personalized maps and spotted some earthquake maps from the Japanese OSM community with uMap, so it seemed worth a try. It was heavy going for a map made on the fly – the polygon tool was clunky and importing the list as a cleaned up .CSV wasn’t happening.

Still, a few things pop out: A few of these risky buildings are also near construction sites. In OSM, these are shown in sage green. (The light green represents parks.)

The struggle to use the uMap polygon tool is real. This is a closeup of 550 California Street, with a 19-story office building under construction nearby.

The Folsom Bay Tower will be a 39-story, 422-foot (129 m) residential skyscraper.

Park Tower at Transbay will have 43 stories. First & Mission’s Oceanwide Center features a 636-foot-tall tower on Mission at First Street and a 910-foot-tall tower on the opposite corner on First Street.

And much like the reporter, shocked to discover the NYT offices are in one of these buildings, there were a few a-ha moments. A family member works in one and I’ve been inside at least a handful recently – an event at Autodesk, a movie at Embarcadero Center, a Wikimedia meetup, met a friend staying at the Marriott, emerged from the Montgomery Street Station in front of one three or four times, etc.

It’s an unscientific sample size of one (well, two if you count the reporter) but I would wager that most people who live or work in San Francisco are around, if not inside, these buildings frequently.

How to use mobile app Go Map!! to edit OpenStreetMap

Here’s a quick tutorial for the Go Map!! iPhone / iPad app (v1.5.3) created by Bryce Cogswell and available gratis in the app store.

The short video above shows how to edit in two scenarios – adding information to an entry and adding a point-of-interest. The text below is a mashup of my experience using it and the app’s help section.

Making digital maps with pen and paper: Meet Field Papers

Field Papers is a great low-tech solution for mapping. You choose an area to map, print it, walk outside with the paper copy and mark things up, then scan or take a pic of it with the QR code and it’s added as a layer to OpenStreetMap (OSM). From there you can add your data to the largest public, editable map in the world.

It’s the handiwork of venerable design firm Stamen, which later got together with the U.S. Agency for International Development (USAID) for improvements. Because it’s an open-source project whose last major changes were made five years ago, and many tutorials show the previous interface, it seemed like a good idea to test drive it.

The quick slide show above shows how it works, even if you make rookie mistakes like leaving the clipboard in your photo (duh!). The test run took about an hour total, from figuring out how to position the map to editing in OSM.

It’s been used around the world for large mapathons, where people don’t have smartphones or OSM knowledge — you hand them sheets, they go out mapping, then they hand in the sheets and they’re done. It can be a potential bottleneck for OSM data entry after collection, but surmountable. Potentially it’s also an advantage — you can get a lot of people out mapping but only need a few with OSM knowledge or who want to learn.

Geographer maps San Francisco’s bike politics

Copenhagen has a lot more in common with San Francisco than most people think, says San Francisco State geography professor Jason Henderson.

While many look to the capital of Denmark as a Nordic idyll where the dring of bicycle bells outnumbers the blare of car horns, Henderson says it went through the same political fights to get there. “It’s not a magical unique place, actually, and that opens up the doors to possibility,” says Henderson, who spent a 2016 research sabbatical in Copenhagen and has a forthcoming book about the two cities.

Speaking at a recent Nerd Nite, Henderson gave some gears to grind as San Francisco heads into June 5 elections. Politics matter – how streets are configured, how much car ownership is taxed, how much space is allocated and protected for car parking and who decides these issues – and the daily habits of politicians matter, too.

“It’s important if we’re going to have not just a bicycle city but a truly sustainable transportation city,” he says. The problem? Few San Francisco politicians are really behind the bike as a method of transportation.

Why OpenStreetMap matters: Where did Dokdo go?

One of the rocky outcrops under dispute. Photo // CC BY NC

Battle lines have always been drawn over maps. Place names are political, cultural, temporal: from Constantinople to Istanbul and Burma to Myanmar what a place is called matters.
In the digital age, however, you have no idea who is behind the changes and why. The companies that make the maps millions of people use every day change names following opaque processes that appear to depend on who lobbies loudest at the moment. It’s a strong argument for free, public, editable maps like OpenStreetMap where both the changes and the debate are transparent.

About a week ago, I spotted this poster petitioning Google to put Dokdo back on the map at the Korean American Community Center of San Francisco & Bay Area.

Quick preview of forthcoming book “All Over the Map: A Cartographic Odyssey”

Most of us have swerved a few wrong turns or hacked through some questionable trails and cursed the map. Most of us, though, wouldn’t spend seven years and engage dozens of experts to make a better one.

Then again, most of us aren’t Bradford Washburn. This climb-every-mountain polymath was let down by the sketchy trail maps of the Grand Canyon available in 1969. At the time, age 60 and director of Boston’s Museum of Science, he knew what made a good map. Washburn pioneered the West Buttress route up 20,320-foot Denali, and his map of the peak is still considered the definitive map of the region. A pioneer in aerial photography, he’d go on to map Mount Everest and the Presidential Range.

But it’s his National Geographic Grand Canyon map, finally published in 1978, that illustrates his “extreme dedication to the craft of map making,” says Betsy Mason, co-author of Nat Geo’s All Over the Map blog. At the recent California Map Society spring meeting, Mason previewed one of the 80 stories and showed off some of the 200 maps from the forthcoming book she wrote with colleague Greg Miller, titled “All Over the Map: A Cartographic Odyssey.”

It was the best of crowds (people who readily chime in with the correct pronunciation of “theodolite” and already grasp the merits of hachuring) and the worst of crowds (after lunch on a warm Saturday) but the story behind the Grand Canyon map kept people mostly awake and ready to push over the 45-minute session limit with questions.

Mason and Miller first started the Map Lab blog back at Wired, then moved it over to National Geographic in 2013. Mason, taken with Washburn’s Grand Canyon map the first time she saw it, went archive diving at her new employer’s and found a “huge trove of boxes” about the making of the map.

Photo brewbooks on Flickr. // CC BY NC

Mapping UNESCO’s Intangible Cultural Heritage spots

Looking at the most recent UNESCO Representative List of the Intangible Cultural Heritage of Humanity it’s clear that these “elements,” as they’re called, are all over the map.

There’s painting, weaving, pizza making and spring rituals. But while UNESCO offers up videos, photos and text, there’s no actual map of these landmarks in sight.

Making that map shines a spotlight on why organizing data is crucial, and how every organization is a data trove and should be its own best data detective. Plotting data visually can inform decision making and highlight patterns – inside trends to be worked into deeper grooves or used to recalculate course. The list, according to UNESCO, is “made up of those intangible heritage elements that help demonstrate the diversity of this heritage and raise awareness about its importance.”

Five-minute map: San Francisco’s proposed Uber/Lyft loading zones

Update: March 23, 2018. A pilot zone geofencing Lyft drivers from picking up passengers on Valencia Street has been added in the Mission. Source: Examiner.com

If you drive, walk or bike in San Francisco you know what a nightmare the ride-hailing services can be.

And if you use them often you’re probably in the habit of trying to pin yourself on a side street or a big empty parking space/driveway and pray they don’t double park while trying to find you. (Zipping past the anecdotal, it’s been calculated that 45,000 Uber and Lyft vehicles now operating in San Francisco account for more than 200,000 trips a day.)

So now the city is interested in adding ride-hailing passenger pick-up zones in a horse-trading effort to wring more data from these startups.

The San Francisco Examiner reports there are seven proposed “loading zones” and maybe one or two will be piloted. It’s a well-reported story — except that it’s missing a map. The neighborhoods are Hayes Valley, Inner Richmond, Inner Sunset, Noe Valley, North Beach, Marina and downtown.

Five minutes later with Google Maps:

A few things jump out — there’s nothing in the traffic-choked Mission district (see update above) and two “maybes” downtown. (The mapped one on Howard Street above and another potential one left unmapped since it’s described as “between Howard and Third or Fourth streets.”)

Also, once they’re mapped, if you zoom in it’s apparent that the length of these zones varies widely. The North Beach one looks like road rage waiting to happen.

San Francisco does have passenger loading zones already: white curbs with a time limit of five minutes. By my armchair estimation (and going by the name “curbs”), they’re mostly shorter than the approximately 600 feet (two blocks) of the shortest ride-hailing zones in the Richmond and Sunset…

Thoughts?

Full story over at The Examiner.

What’s under the canals of Venice? Old boats, tires and a few surprises

Image courtesy Ismar-Cnr.

Most visitors to Venice drift through the canals on gondolas taking selfies. But a group of researchers spent seven months puttering along pointing high-resolution multibeam echosounders into the waters instead. About 30 of them in all worked aboard the powerboat Litus, intent on mapping the Venice lagoon to gauge the effects of climate change on one of the world’s most improbable cities.

Research boat Litus, courtesy Ismar-CNR

While what’s under those gray-green waters isn’t exactly surprising — boat parts, old tires and containers — scientists say the underwater elevation mapping (that’s “bathymetry,” for the technically minded) comes at a critical time.

Old boats, tires and containers. Image courtesy Ismar-Cnr.

The last 100 years have radically altered the shape and ecological makeup of the lagoon, researchers say: for starters, salt marsh areas shrank by half and underlying sediment has radically shifted. The “floating city” already struggles to stay above water in the spring and summer floods, and relative sea level rise is expected to increase their frequency. Construction of the Mose system, with its 78 mobile gates that can hold back almost 10 feet of water, launched in 2003 and is said to be near completion in 2018.

Entrance to Malamocco port 1) Mose gate 2) 48-meter (157-foot) trench 3) the oil refinery canal. Image courtesy Ismar-Cnr.

“Before the Mose system begins to function, it was important to have a full picture of the bathymetry and currents of the tidal channels and inlets, which are the most dynamic portion of the lagoon,” researchers say in a paper published in “Nature.” They caution that the relatively rapid erosive process could threaten the stability of the “hard structures” (read: priceless palazzos) in the near future and should certainly be periodically monitored.

If you want to dig into the datasets, the scientists from research groups (Ismar-Cnr and Iim) have CC-licensed and made them available online with the paper.

A scour hole found where two channels meet. Image courtesy Ismar-Cnr.

“The data also allows us to identify areas with large dunes at the bottom and adjacent erosion sites that document the most dynamic points in the deep lagoon, where it’s important to cyclically repeat these studies to quantify the movement of sediments,” Fantina Madricardo, head of the study, says in the press release (translation mine).

Part of the reason these Venice maps look so trippy (or alarming?) is the city’s curious geography: it perches atop 118 islands separated by canals and linked by bridges. On most bathymetric maps, deeper waters are represented by soothing darker shades (green, blue, violet) and warmer colors (red, orange, yellow) represent shallower waters. A bathymetric map of the San Francisco Bay by comparison looks, well, a lot more soothing despite its notorious currents.