News Apps Dev Opening!

The Texas Tribune’s news apps team is looking for its newest member — a developer in charge of maintaining our flagship explorer news apps. You’ll be joining the team that creates news apps for the Tribune, and you’ll be the go-to person for our salaries, prisons, campaign finance and public schools explorers (to name a few).

Our explorers house everything from hundreds of thousands of state salaries to extensive campaign finance data to records on 8,000 public schools. You’ll be tasked with corralling records from hundreds of data sources and streamlining the process of keeping these apps up to date. This is a tough job involving open records requests to hundreds of organizations and departments statewide. The ideal candidate will always have an eye out for tasks that can be automated, but won’t be afraid to pick up the phone.

The Texas Tribune was founded on a bet that Texans wanted a non-partisan, authoritative source for news and information about Texas politics and policy — and that they love data. At just over four years in, news apps have been at the heart of our success.

We’re a Python shop (Django, specifically) on the back end with modern JavaScript on the front end (Backbone, Underscore, and D3 with a firm love of the event loop). You should be unafraid of large, unwieldy data sets and comfortable writing scripts to handle those transformations because we show our work. The person who fills this job is going to live on the command line and in a text editor while they get acquainted with the mountains of data we process.

The Tribune is a place where people love their jobs, are committed to the mission and are excited to come to work every day. Think you might be a good fit? Please take a few minutes and fill out this form. We look forward to hearing from you!

Area nerd finds cool data difficult to put on map

Originally posted on Dan’s personal site, which features an embedded interactive map.

Interning at the Texas Tribune has been a blast because the newsroom is comfortable working with data. I’ve enjoyed working on a few projects with managing editor and criminal justice reporter Brandi Grissom, and recently she approached me with an awesome dataset.

For the last two years a state law has required Texas county jails to file monthly reports of how many undocumented immigrants are housed with Immigration and Customs Enforcement detainers and how much money is spent housing the immigrants. Brandi filed a request to the Texas Commission on Jail Standards and received PDFs of these ICE detainer reports for each month. With the help of Tabula I had a database of county jails and how much they were spending on undocumented immigrants, which was exciting because immigration is one of my biggest interests.

My first instinct was to use this awesome dataset to make a cartogram - LA Times style - resizing Texas counties by how many undocumented immigrants are housed in their jails.

But the data quickly told me this wouldn’t be the best approach for this story.

Outliers were the problem. Harris County, which includes Houston, housed way more undocumented immigrants than any other county jail. Some quick charts in R demonstrate how crazy it is to compare counties:



It was clear to me that a visualization with Harris County on the same scale as other Texas counties would do a disservice to the data. A cartogram could show the disproportion, but it would be difficult for users interested in smaller counties. Accessibility was a priority because this little-known dataset hadn’t been covered in many other places and we wanted to show users how much their county jails are spending on undocumented immigrants. The newsworthiness of the data encouraged us to publish quickly, but that meant I had less time for design.

I was recently admiring the Chicago Tribune medical costs database that lets users get to the information they want in what seems like a huge dataset. With the help of Ryan Murphy’s TableSift.js, I made a sortable table to let users find their county jail and see totals for the two-year period we analyzed. Brandi advocated a visualization of prisoners and costs over time, so I added a simple sparkline plugin.

Check out Brandi’s story here and the chart here.

I was still curious what a cartogram would look like, so I spent the weekend fiddling with Raphael.js and d3.js. Anthony Pesce’s shp2svg tool made it easy for an SVG noob like me to get the data on a web page. The map on my site shows Texas counties sized by how many undocumented immigrants are housed in their jails in 2012, using d3 linear and exponential (totally misleading, but I like the animation, ok?) scales.

I also gave a cartogram program called ScapeToad a whirl:


We made the ICE detainer reports available to download, so check them out! It’s a neat dataset and I’d love to see if people have other viz ideas.

Friends With Benefits: Adapting MoJo’s CYOA App

By Amanda Krauss and Becca Aaronson

Amanda says:

I think Mother Jones's choose-your-own-adventure (CYOA) interactive flowcharts are the perfect way to make serious issues fun, engaging, and a little less scary for everyday readers. After reading their first CYOA story (the hilarious Slut Quiz) I was excited to see that MoJo’s dev team had released a CYOA plugin for general use, and totally geeked to meet creators Ben Breedlove and Tasneem Raja at this year’s NICAR, where Texas Tribune reporter Becca Aaronson and I attended a live workshop on how to make a CYOA.

Becca says:

Only a total data nerd would get as excited as I did watching Tasneem demonstrate MoJo’s Choose Your Own Adventure (CYOA) app at NICAR last year. What immediately struck me was how easy it would be to adapt the code for our website.

As a journalist, not a programmer, I’ve learned to code because sometimes coding is necessary to create the stories that I want to tell. In addition to working on the News Apps team, I’m also the Trib’s health care reporter, so there isn’t always time for me to devote to coding. (You may have noticed we’re rolling out this insanely complex federal health care law in a state full of leaders who oppose it.) That’s why applications like MoJo’s CYOA, which are easy for journalists to adapt and change to fit their storytelling needs, are so amazing. It’s also easy for us to forget that learning how to code can be really intimidating for journalists. CYOA is a great icebreaker to get journalists to realize how easy it can be to create some news apps.

Amanda says:

Becca wanted to make her first CYOA interactive to help people explore Medicaid expansion, and I helped her with the code. After restructuring and styling the CYOA app to be responsive and mobile-friendly for our site, I contributed to the main project repository, meaning that I proposed adding the changes and they were accepted. That was pretty exciting. Recently, Becca decided to make another CYOA to help Texans understand their options under the Affordable Care Act, and that gave us another opportunity to experiment with Tribune-branded styling and better performance.

Becca says:

When we attended the workshop, Medicaid expansion (or lack thereof) was a huge issue in Texas, and a lightbulb just went off in my head that the application would be perfect for creating an interactive quiz to help people determine where they really stood on the issue. Many proponents of Medicaid expansion crossed political party lines, but because Obamacare was so politicized many people outside the health care community didn’t think about the issue enough to really figure out whether they thought Medicaid expansion was a good idea or not. So, Amanda and I adapted the CYOA code to fit our Tribune style and created a quiz for people to figure out where they really stood on the issue.

So how easy is it to create a CYOA app? Let’s walk through the process.

First, you need a good idea. The CYOA app is basically a flow chart, so it’s useful for showing many answers to a question. For our most recent CYOA app, the question was basically, what’s my best option for complying with the health insurance mandate in 2014? There are lots of different answers, depending on the person and situation, so the topic lent itself really well to a CYOA app.

To get started, I recruited our intern Edgar Walters to help me research the many answers to that question. We sat down and brainstormed the best way to direct people to the right outcome. Here’s a picture of our first draft:


A good start, but we needed to flesh out the details. Here’s the second draft:


You’ll notice that it’s really just a flow chart. The next step is to move the flow chart from paper to a spreadsheet. While the spreadsheet may look complicated, it’s really not. The column headers are called from the JavaScript, so keeping the column headers consistent is important.


slug: The unique ID for a slide.

text: The text that you want to appear on the slide, e.g. “Do you currently have health care coverage?”

connects to: A list of the unique IDs or slugs for the next slides that you want to send the user to. They should be separated by a “|”. Remember, you’ll have to create a new row for each “slug” that you list.

connects text: The text that you want to appear for the person to select that will lead them to the next slide. For example, “Yes” or “No.” These should also be separated by “|” and in the same order as the slugs listed in “connects to.”

background image: A hyperlink to an image or gif. MoJo has done some really hilarious CYOA apps using this field to add funny gifs in the background. But for something serious like health reform, we chose to keep it simple and left this blank. Works either way!

sourcing: Text to show the source information on the slide. Optional; you can leave this blank.

source link: A hyperlink to the source information. Also optional.

So, what’s that look like? Here’s an example from the Medicaid Expansion CYOA spreadsheet:
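As a purely hypothetical illustration (these slugs and wordings are invented, not taken from the actual Medicaid spreadsheet), a few rows might look like this:

```
slug           text                                       connects to                connects text
start          Do you currently have health coverage?     has-coverage|no-coverage   Yes|No
has-coverage   Great, you may already be set.
no-coverage    You may need to shop on the marketplace.
```

Note how the “start” row lists two slugs in “connects to,” and each of those slugs gets its own row.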

The CYOA app loads the information directly from Google spreadsheets using Tabletop.js. So when you’re finished inputting your flow chart into the spreadsheet, all you have to do is publish it to the web (under File > Publish to the web). Also remember to make sure your share setting is set to public! Once you’ve published, you’ll see that Google has given your spreadsheet a unique “key.” The key for our spreadsheet is highlighted in the picture below. Grab and copy your “key,” because you’ll need it for your code.

Amanda says:

You’ll need to grab the code from the MoJo repo, plus an index HTML page that links to the JavaScript and CSS files. Most of the initial debugging involves making sure all the files are correctly linked - if it’s not working, just look in the console to see what’s up.

The CYOA code is pretty simple. The HTML on the main page looks like this:

<div class="cyoa_wrapper">
<div class="cyoa_container"></div>
</div>

That’s it! The script (usually located at the top of the page) is what does all the work, grabbing the container div and using it as a base to build the pages from the spreadsheet data Becca described above:

<script src="//"></script>
<script src="//"></script>
<script src="//"></script>
jQuery(document).ready(function() {
  var cyoa = jQuery.Cyoa('0AgQB1XZuIQjddDh4NGQ1cXl5WnJzdkR5Tl9teHNIUFE', {
    separator: '|',
    control_location: 'bottom',
    tabletop_proxy: '//'
  });
});

The first script loads the plugin code for future use. The second script allows your spreadsheet to be imported. The third is a templating library that the plugin uses to make the pages. The last script is the one actually calling the plugin function to build the pages, and it needs all the previous files to be loaded to do its work; you’ll see the key Becca mentioned in it - that’s how it knows where to get the information to build the pages. If all goes well, you should see your CYOA in the browser.

Once you’ve got it working, there’s always some tweaking and debugging to do; this time we changed the CSS to make it look more like the Tribune branding, and fellow news apps developer Ryan Murphy found that we needed to update Tabletop, as well as pointing out this bug, which can interfere with performance.

In conclusion, building a CYOA is relatively simple, a really fun way to tell a story, and a great way to get on board the code train.


What I Learned at The Texas Tribune

By KK Rebecca Lai

1. Newsrooms are fun


Between daily stories, breaking news and data reporting, there was never a boring day at the office. There is always lots of adrenaline going when trying to meet deadlines or coming up with creative solutions to telling a story.

2. Do it the way we’ve done it, unless you can do it lazier


Being a good programmer is about being lazy. (What?!) Our tech team always wants to write more compact code, be more efficient and have our apps run faster. While my code is still clunky, I have definitely become more aware of the need to simplify things whenever I can. Now I try to take the time to take out that extra CSS, or if I’m already loading jQuery, I take full advantage of it so I can write shorter code.

3. Always be scrapin’


Being proactive in looking for data helps create more comprehensive stories. I scraped streamflow data for a story about farmers restoring dry crops, and data on exonerated prisoners for another. The data made the stories more interesting and the reporting more in-depth.

4. Reporters are my friends


At the beginning of the summer, I rarely interacted with reporters. But I later realized the writer is always your best source of information when trying to create interactive components for stories. Sitting down with reporters to talk about ideas, data sources or presentation made my job much easier and made the interactive components fit much better with the stories they accompany.

5. Editors are also my friends


It is sometimes frustrating for me when editors look at certain projects and want to change things I’ve worked on for a while. But having a fresh set of eyes and another person to test my projects has always helped me find flaws or things I can make better. One of my favorite maps, about charter schools in Texas, went through dramatic changes after the editor looked at it, but it is also one of the stories I got the best feedback on.

6. Ask Google First


To avoid always bothering the tech team with questions, I try to ask the Internet first. It’s much easier to find a comprehensive explanation without wasting someone else’s time.

7. Ask questions


While I try to Google everything, sometimes I jump right to asking the tech team. I could be stuck looking up something in Django for an hour, or get the answer in two seconds. The people standing around me have built the system that we are working on from scratch, so who better to ask than them?

8. Don’t work in a bubble


While I enjoy being plugged in and focused on what I’m doing, I try to unplug once in a while and have other people look at my projects. By this I don’t just mean talking about code, I mean talking through the project with reporters and editors as well when you feel stuck or frustrated.

9. It’s ok to have fun sometimes


At the start of my internship, I avoided going on Facebook or Twitter too often for fear of my boss thinking I was lazy or unproductive. But it became unhealthy to just work the whole day without any entertainment. When I’m stuck on a problem with no obvious solution, I stop to read blogs, chat with people or make some GIFs. Coming back to my code refreshed sometimes helps me spot problems that I’ve been stuck on for hours.

10. Goodbye is not really goodbye


I was very sad when I had to leave the great city of Austin and the Texas Tribune. But don’t think this is really goodbye! I’ll see these people from the newsroom at various conferences like NICAR, and we can interact online with GIFs, or maybe someday we’ll end up in the same place.

In other news, here’s a glimpse at the next news apps intern Dan Hill.


Be a Texas Tribune OpenNews Fellow

Great news: The Texas Tribune is one of six newsrooms across the globe chosen to host a Knight-Mozilla Fellow for 2014. You can (and should!) apply between now and Aug. 17.

Why should you join us? The Texas Tribune is right smack in the middle of some of the most exciting news, digital storytelling trends and innovative journalism practices in the country.

Also, we have breakfast tacos.

Torchy's Breakfast Tacos

Developing at the Texas Tribune

The Texas Tribune was founded almost four years ago amid a lot of speculation on whether the non-profit model was a good fit for politics and policy journalism. We bet on the fact that Texans were looking for a non-partisan, authoritative source of news and information — and we were right. Embracing news apps and relying on data as a storytelling tool from the start have been major contributors to the Trib’s success.

2014 is a big election year for Texas: Six of eight statewide elective offices — all currently held by Republicans — have no incumbent. Democrats in this red state are still seeking a toehold. With longtime Gov. Rick Perry’s decision not to run for reelection, the political chessboard is more exciting than ever.

Our News Apps team — and our fellow — will be in the middle of the biggest Texas story of the year, making election results available throughout the state via our interactive election coverage. Think brackets, scoreboards, campaign finance, and whatever else we can come up with.

Data has been in the Trib’s DNA since day one — and we’ll be reinforcing that in 2014. Apply today to join us for one hell of a ride.

Did I mention we have the world’s best brisket?

Franklin Brisket


Announcing TEC Filing Fetcher: Get Campaign Finance Data Quickly

by Ryan Murphy (@rdmurphy)

In mid-July, Texas’ lawmakers and political hopefuls filed semiannual campaign finance reports, which fired up the 2014 election speculation machine and gave us the opportunity to see the lay of the land post-session.

Here at The Texas Tribune, we already do a pretty good job of keeping a finger on the campaign finance beat – in fact, it’s one of our major news apps. However, there’s one problem.

The Texas Ethics Commission – the state agency responsible for collecting this data and ensuring that politicians do due diligence in getting filings in – actually does a great job of making all of the campaign finance data it has available to the public, both as a zipped download of text files and in Microsoft Access format. But those files aren’t available immediately. Typically, the TEC produces them 1 to 2 weeks after the filing deadline, once it can corral the stragglers and account for the immediate corrections.

For the most part, this is “good enough” for our news app. We wait for the data drop, we run our scripts against it to push new entries into our database, and our app is happy.

But that wasn’t good enough for me.

After witnessing a historic filibuster, learning our governor of more than 14 years would not run again and watching the latest state fundraising powerhouse announce his candidacy for governor, I knew I wanted breakdowns on the numbers for certain individuals as soon as I could have them.

I remembered that the TEC has a text-based representation of electronically filed reports, in addition to the generated PDF version that people usually find themselves combing through. When you pop open the text version, you’ll see plenty of commas and immediately think “csv!”, but it’s more a distant cousin of everyone’s favorite data delivery format. (Note the complete lack of a header. And different types of information on different rows.)

But that’s okay! They provide a legend.


So I got to work. I set up a web scraper whose only job was to let me know once something got added to a politician’s page. (It would essentially email me to say, “Hey dude, this page is different, check it out yourself.” Very basic/snarky.)
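The alert mechanism boils down to comparing page snapshots. Here's a minimal sketch of that idea in Python; the fetching and emailing around it are omitted, and none of this is the scraper's actual code:

```python
import hashlib

def fingerprint(content):
    """Hash raw page content so snapshots can be compared cheaply."""
    return hashlib.sha256(content).hexdigest()

def has_changed(content, last_fingerprint):
    """True when the page differs from the last snapshot we saw.

    In the real workflow, a True result would trigger the (snarky)
    notification email.
    """
    return fingerprint(content) != last_fingerprint
```

Store the fingerprint between runs, and each check is a single string comparison instead of a diff of the whole page.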

The end result was the campaign finance analyzer. After I received an alert, I’d grab the new filing ID and pull down that text representation of the filer’s report. After some light massaging/calculations, the numbers would get added to the interactive. Once someone filed, I’d have their data ready to go within 5 minutes. If I had automated this (next time!) it would have been even faster.
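The "light massaging" step is essentially grouping those typed rows and totaling the amounts. A rough sketch of the idea; the record types and column layout below are invented for illustration, not the TEC's actual legend:

```python
import csv
import io

def sum_by_record_type(report_text):
    """Total the amount column for each record-type prefix.

    Assumes a simplified type,name,amount layout per row; the real TEC
    format has many more row types and columns.
    """
    totals = {}
    for rec_type, _name, amount in csv.reader(io.StringIO(report_text)):
        totals[rec_type] = totals.get(rec_type, 0.0) + float(amount)
    return totals
```

Because each row type carries different fields, the first column tells you how to interpret the rest, which is why the legend matters so much.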

So what’s the point of this story? (Talk about burying the lead.) I’ve cleaned up the convoluted mess I originally hacked together to collect contribution and expenditure data, and I’m open sourcing it. With a simple pip install tecfilingfetcher, you get a command line tool that accepts any TEC filing ID you toss at it and spits back out either itemized contribution or expenditure data. If you’ve ever used csvkit – and if not, you should immediately – you’ll feel right at home. You can even pipe what TEC Filing Fetcher gives you into csvkit functions, if you feel so inclined.

That’s it! Please try and break it. Then tell me about it.

TEC Filing Fetcher on GitHub
TEC Filing Fetcher on PyPI

Brazos River Map: Working together in the newsroom

by KK Rebecca Lai

I recently worked with two reporters to build a map based on Brazos River discharge data to get an overview of water availability around the river’s basin. It took a while before the reporters and I came to an understanding of what data was needed, where it was available, and how it should be collected. Reporters don’t necessarily understand what web technologies are available to them, and I did not do a great job of communicating the extent of the tools the web can offer. To improve this communication, I would like to document the data collection process used to build this map.

In creating the Brazos River map, the reporters provided me with a list of stream site numbers along the river. The stream sites are water stations along the river that collect discharge data. Instead of collecting the data by hand by visiting 80 websites, a Python script can go through all the pages and extract the data in a matter of seconds. These Python scripts are called scrapers.

At the USGS WaterWatch database, you can enter a site number, choose “Stat (download)” for the output and download a CSV with discharge data. There are about 80 stream sites. To collect these files, I wrote a scraper that takes the site numbers as input. Essentially, the program automatically enters every stream site number into the “site number” field (see below) and downloads the file with the water-discharge statistics of every stream site to my computer.
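The scraper's loop looks something like the sketch below; the URL template is a placeholder, not the actual WaterWatch endpoint, and the real script saved files to disk rather than returning them:

```python
import urllib.request

def download_discharge_csvs(site_numbers, url_template, fetch=None):
    """Download one stat CSV per stream site, keyed by site number.

    `fetch` is swappable (e.g. for testing without hitting the network);
    by default it makes a real HTTP request.
    """
    if fetch is None:
        fetch = lambda url: urllib.request.urlopen(url).read().decode()
    return {site: fetch(url_template.format(site=site))
            for site in site_numbers}
```

With 80 site numbers in a list, one call replaces 80 manual form submissions.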


After collecting the discharge data, we realized that there was no geographic data on the stream sites with which to plot them on a map. We found detailed information about each stream site on USGS’s site.


I wrote another scraper to get the geographic data. This time I passed each stream site number into the URL: "”. After requesting the page, the scraper collects the latitude and longitude data from the site. The discharge and location data were put together in Excel; I then imported the data into our mapping program, TileMill, and styled the map.
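The extraction step of that second scraper is essentially a pattern match on the page text. A sketch with made-up markup (the real USGS page is formatted differently):

```python
import re

def extract_coordinates(page_text):
    """Pull latitude/longitude strings out of a site's detail page.

    Returns (latitude, longitude) or None if the page doesn't match.
    """
    match = re.search(r"Latitude\s+([^,]+),\s+Longitude\s+(\S.*)", page_text)
    if match is None:
        return None
    return match.group(1).strip(), match.group(2).strip()
```

Run this once per site page and you have a coordinate pair for each stream site, ready to join against the discharge data.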

To provide a bit of context on the river and the river basin, we looked up geographic data on the Brazos River and the Brazos River Basin. We found an online Geographic Information System (GIS) database by Texas A&M University and downloaded state river and river basin data.

While it might be unrealistic for all journalists to learn a programming language to write scrapers while working on deadline in the newsroom, a better understanding of the scope of available technologies would make communication with the tech team much more effective and speed up the data collection process.

Hey, I built a thingy: Announcing xscrolly.js

by Chris Chang

xscrolly.js is a JavaScript library/jQuery plugin that makes it easy to write callbacks for scroll events and for DOM elements as they scroll in and out of view.

The world does not need any more jQuery plugins, but I needed something and I could not find a pre-built solution. The specifics around xscrolly’s birth were that I was looking for a way to trigger a callback when an element was scrolled to. I’ve written scrolling callbacks before, and hated it every time. Reluctantly, I set about making yet another JavaScript library.

The first thing I did was to look into hooking into some existing scrollspy code on the page. I immediately gave up.

Then I realized I should start over from scratch and pretend the existing code did not exist. After all, we’re trying to write smaller, self-contained pieces of code here at The Texas Tribune now.

I started by describing the problem mentally: I want a script that keeps track of multiple elements on a page, and decides which one is in focus; and when we change focus, I want to be able to run some arbitrary code. With this, I coded the basics. Then, I realized that a lot of the other things we were doing with scroll could be done with xscrolly with very little modification. Fifty very little modifications later… I came up with xscrolly variants of scroll spying nav, fixed header, and lazy loading images.
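The core focus-tracking logic is simple. Sketched in Python for illustration (xscrolly itself is JavaScript, and this is not its actual code):

```python
def element_in_focus(scroll_top, element_tops):
    """Index of the element whose top edge was most recently scrolled past.

    Before any element's top has passed the viewport top, the first
    element is treated as focused (the "always assume an element is in
    focus" behavior).
    """
    focused = 0
    for i, top in enumerate(element_tops):
        if top <= scroll_top:
            focused = i
    return focused
```

Compare this index on every scroll event; when it changes, fire the user's callback with the newly focused element.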

Of course, a few days later, I happened to find a similar jQuery plugin: Waypoints. Now xscrolly’s entire existence was in question. But after comparing the two, I decided they were different enough to justify keeping and developing xscrolly. The biggest difference is that Waypoints can’t do my original objective: always assume an element is in focus. Waypoints is more mature and has some shortcuts for common things people do with scrolling.

Even in its early state, it’s been useful. I needed a really quick fixed header on the Voting Scorecard app, and xscrolly came to the rescue. I’m also still in the process of converting that original page I ran away from to using xscrolly, and it’s already working much better than the original JavaScript.

