Monday, May 2, 2011

Economic History

The Berkeley Economics PhD program is unusual in requiring first-year students to take economic history as part of the core. The other core classes (macro, micro, metrics) involve lots and lots of problem sets. If I had been blogging the past year, about 95% of my posts, like 95% of my conversations, would have consisted of griping about "p-sets." (I'm surprised I still have any friends other than my classmates!)

History is a welcome difference. Instead of problem sets, we have weekly memos and a research paper. Professors Eichengreen and DeLong post a memo question each week related to that week's assigned readings, and often also related to current issues. The reading list is more thematic than chronological, and covers a lot of classics. Walter Bagehot's Lombard Street gets a week, as does the gold standard; the Great Depression gets three. Plenty of context for comparison with the past few years. Berkeley professors are pretty well represented in the readings, which I like. It fosters school spirit in a way. The lectures are once a week for two hours, and are a lot of fun. On the first day I had to sit on the floor because the room was so full, though attendance fell a bit after that. They're a nice break from the math-class-style lectures of the other courses, and help put things in context.

I also really enjoyed writing the memos. We were directed to "flex our prose-writing skills," which I took to heart (although my grades were negatively correlated with the extent to which I did so; my driest papers got the best scores).

The main assignment is the history research paper. It is due on May 4, and I think I've finally finished. Proposals were due before spring break, and the end product bears almost no resemblance to what I proposed. I wrote partial drafts of several different papers en route to my eventual topic. I knew I wanted to do some sort of quantitative text analysis, and first thought of an idea related to geography: an overcomplicated scheme relating place mentions to interest rate and wage convergence. Then I saw a paper by one of my macro professors, Yuriy Gorodnichenko, that included a content analysis of speeches by recent Federal Reserve chairmen on price-level targeting vs. inflation targeting. I got excited about the idea of Federal Reserve content analysis, especially because, after really enjoying Professor David Romer's Econ 202A course, I had become a big fan of the "Romer and Romer" papers and the narrative approach to fiscal and monetary policy history.

So I spent most of spring break (except for a lovely weekend in LA) learning to program in Python and writing code with the Natural Language Toolkit (NLTK) to do text analysis of Federal Reserve documents. I ended up being able to do some pretty cool types of analysis, including some things in the style of Google Ngrams, but for .txt documents or HTML pages. Then I thought the hard part was done: I would just "run the code" and have my results. Sigh. So naive.
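The core of that Ngrams-style counting is very simple. Here's a minimal sketch in plain Python (the function name and sample text are illustrative, not my actual code, which used NLTK's tokenizers):

```python
# Illustrative sketch: relative frequency of a target word in a
# plain-text document, Google-Ngrams style. Not the actual research
# code, which used NLTK tokenizers on Fed documents.
import re
from collections import Counter

def word_frequency(text, target):
    """Return the share of word tokens in `text` equal to `target`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return Counter(tokens)[target] / len(tokens)

sample = "Unemployment rose sharply; unemployment dominated the news."
print(word_frequency(sample, "unemployment"))  # 2 of 7 tokens
```

Run the same counter over a document for each year and you get a time series of mention frequencies.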

The trouble was deciding which documents to analyze and what to analyze them for. I met with Prof. Gorodnichenko, who suggested I study the perceived tradeoff between inflation and unemployment. I wasn't entirely sure how to go about that, but decided to start trying things out on some historical Fed documents. Problem: many of the historical documents (e.g. this one) are in secured PDF format. That means you can't save them as a text file or copy and paste the text out of them, which means you can't do anything with them except read them by eyeball. I figured that Prof. Romer might have access to unsecured PDFs and went to meet with him, but he didn't either.

Now I was pretty aggravated, not only because I couldn't run my code, but also because I used to do research on accessible digital technology for people with disabilities, and I know that secured PDFs are far from accessible. They are incompatible with screen readers, for example. And yet the Federal Reserve website claims:
"The Board of Governors of the Federal Reserve System is committed to making its website accessible to all users and ensuring that it meets or exceeds the requirements of Section 508 of the Rehabilitation Act.

The majority of documents on our website are presented in PDF, HTML, or plain text format. When documents are provided in multiple formats, at least one version is designed to be accessible to users of assistive technology."

The Freedom of Information Act, which requires public documents to be disclosed, has only nine exemptions, and I can't see how the Fed's historical documents fit under any of them.

Being in Berkeley has taught me a thing or two about righteous indignation, so I called the FOIA office and explained what I was after, using every bit of Southern charm I could muster. They promised to get back to me in two days. Guess what? They didn't. So I called back and used my Berkeley voice, and they took down my email address. At first their message never arrived because they misspelled "Berkeley," but when it finally did, it consisted of a link to the exact webpage where I had found their phone number, plus instructions that I could fill out a FOIA request but should expect a long wait because they were really backed up.

I knew I wouldn't get any results in time for my history paper, so I had to come up with a different topic. Now I'm debating the wisdom of FOIA-ing the Fed as a first-year grad student. A request would be more or less a matter of principle at this point, and maybe useful for future research too, but mostly I want them to put accessible versions of all the public documents on the website. I'm leaning towards doing it. Anyone have advice?

For a while I got sidetracked into writing fancier and fancier Python code with less and less clarity about what to use it for. But I happened to notice a very simple, but very striking, result: the frequency of the word "unemployment" in the New York Times is highly correlated with the actual unemployment rate, and the relationship is stable over time (I started in 1914, before unemployment statistics were even published). I was reluctant not to use the more "high-tech" functions I had programmed, but could tell that this result would make for a better, less convoluted paper. I hope I was right! Here's my favorite line:

If it’s hard to know what people know, and harder to know what people knew, it’s hardest to know what people knew but didn’t know they knew.
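The result itself boils down to a Pearson correlation between two yearly series. A minimal sketch in Python, with invented numbers standing in for the real NYT frequencies and unemployment rates:

```python
# Illustrative sketch: Pearson correlation between a word's yearly
# mention frequency and the unemployment rate. All numbers invented;
# not the paper's data.
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical yearly series: "unemployment" mentions per 1,000 words
# vs. the measured unemployment rate (%).
mention_freq = [0.4, 0.9, 2.1, 1.5, 0.6]
unemp_rate = [3.8, 5.9, 9.9, 7.4, 4.1]
print(round(pearson(mention_freq, unemp_rate), 3))  # strongly positive
```

A stable, high coefficient like this across subperiods is what makes the mention frequency usable as a proxy for what people knew before official statistics existed.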

It's my first shot at independent economic research, and I'm pleased with how it turned out and glad I enjoyed the process. I'm eager for feedback from my professors and peers.


  1. Hi Carola,

    I am so sorry about your elbow, but I am glad about your writing. I know it's probably too late, and not what you are trying to do with a FOIA request, but this blog post explains a way to OCR protected PDFs.

    You are probably the only person in the universe who spends spring break learning Python, but as a fellow scholar I understand that you probably had a lot of fun figuring out your language-analysis algorithms.

    Good luck with the end of this semester!!!


  2. Thanks, Mauricio. I wonder if I would get in some kind of trouble if I were to unprotect some Fed documents and then publish the results!

    I turned in my history paper today and ended up using New York Times articles instead.