Tuesday, September 30, 2014

An Offer You Can't Not Refuse

On a mailing list, received today:

From: XXX 
Subject: Seeking Technical Co-Founder  
I'm working part-time with XXX, an early-stage start-up providing an activity-planning and metrics-gathering platform for hospitality and tourism organizations. XXX is looking for a technical co-founder to join the team initially in a part-time, equity-based capacity.  The co-founder should have experience with mobile and web development, including in a LAMP environment, and be able to contribute in both hands-on and strategic fashion. 
If you or someone you know might be a fit or you would like more information, you can contact the CEO XXX directly at XXX
Not only will you not pay anything, and not contribute anything, you're still using PHP.  (Actually, more likely, you just don't know WTF a "LAMP environment" is.)

Friday, May 16, 2014

Big Data: It's Not the Size That Matters

Over a year ago, Ari Gesher asked me to sit on a panel at Georgetown Law called "Swimming in the Ocean of Big Data".  After I agreed to it, he sprang on me that being on the panel required me to write 10,000 words for their journal.  So, the article finally arrived physically on my desk today (they sent me actual paper copies, how quaint!).  You can find it here.  The editor's abstract:

In this article, the author argues that the premise of “big data” lies not in the amount of data that we can generate, collect or store, but in the ability to use data to make informed decisions. He explores both the needs of data analysis projects and how to apply these types of projects to real-world circumstances. The article concludes that whatever data systems are ultimately put in place, privacy and civil liberties should be a primary concern and not simply an afterthought; the author proposes methods that can be implemented during the design of big data programs in order to curtail privacy violations.

The academic in me finds it hard not to be proud that I have another publication to my name, but the reformed-academic-turned-engineer in me knows that publications are a poor substitute for Actually Doing Things That Make a Difference.  Nice to have it both ways for once.

Thursday, March 13, 2014

You Don't Need a Dashboard

For at least as long as I've been working in data analytics, the clamoring from the datarati for Dashboards! Dashboards! Dashboards! has consistently risen year over year in pitch and volume.  Spotfire and Tableau are two really popular products, and my company has at least three different "dashboard-style" plugins.  This week I also discovered Kibana, which frankly looks really awesome, because it's free and dead-easy to set up if you already have ElasticSearch.  (ES is also dead-easy to set up; however, as I discovered the hard way, beware of multiple people prototyping with ElasticSearch on the same subnet: the out-of-the-box configuration will make everybody's ElasticSearch node automagically and transparently join into one cluster, with obviously undesirable results.  Data integration!)

The most common thing I hear about dashboards is that people want them to "spot trends".  And, as far as I can tell: No, you don't.  Not really.  One golden truth I have learned from working in data analytics is this: If you cannot pose a concrete question that you would like to answer or a concrete problem that you would like to solve, then you're wasting your money.  Dashboards do neither of these things, at least not in the way most people use them, which is often as command center props.  The process of formulating a question in a way that can be answered using data is not simple, and hence visualizing data in ways that are essentially static will not answer meaningful questions.  And by "static" I don't mean "has no temporal aspect".  Charts that show values over time are still static if the things being plotted cannot be changed easily and intuitively.  And, if you spend a lot of time changing the visualizations and exploring different hypotheses, then you're not using a "dashboard"; think about it: a car dashboard is a thing you look at to get an immediate read on something, like your speed or fuel level.  How often do you switch your speed readout from MPH to KPH?  So, if you are the one constantly changing the visualizations: congrats!  You're a data scientist!  You should stop thinking about dashboards and start thinking about a real, scalable data analysis platform.

Thursday, March 6, 2014

From the Archives

Note: This is a repost from my old blog, explaining my departure from academia; this came up in response to some recent discussions about a candidate who was deciding between attending CS grad school and getting a job, and how to dissuade them from the former, which led here:
Several people asked me to repost this in a place where they could read it, so here it is.

In case you've been coming here day after day, wondering why I have stopped posting, and where in the world is Carmen Sandiego, I thought I would update my (rapidly shrinking) fan base with my whereabouts: for almost a year, with the prospect of my funding drying up, and no publication in easy sight, I spent some time taking stock of my options, and deciding what I wanted to do with the rest of my life. And, the fact of the matter is: I was bored. I was unhappy. I really think single molecule biophysics is awesome, and fun, and there's great stuff happening. But I had discarded enough plastic pipette tips to last a lifetime, and was finding it increasingly difficult to care about the rate constant for phosphate release of E. coli RNA polymerase in the presence of XYZ. Additionally, I had grown to really love the Bay Area, and didn't want to leave my friends, my girlfriend, and my hang glider behind.

My options, as I saw them, were:
  1. Keep on keepin' on, and try to find a professorship at the University of Wallamaloo, or wherever I could, and hope that things would be more fun and interesting as a mid-grade intellectual at a mid-grade university.
  2. Back out, and try to find another postdoctoral appointment doing something completely different, and hope that, four years down the line, at 38, I wasn't totally burnt out on that as well. I considered computational evolution, or even getting a master's in EE or CS, and seeing where that took me.
  3. Get a real job.
I talked to a lot of people, and received some encouragement from some corners, most notably from Ben Ovryn at AECOM who strongly encouraged me to not give up on academic science. Just as conspicuously, I received no such encouragement from my research advisor, who, when I discussed my options with him, basically shrugged.

I applied for biotech jobs, software jobs, and a professorship at City College of New York, because I thought it would be fun to live there and teach there. I considered taking a year off to write a book, or become a professional hang glider pilot, or both. I thought of opening an artisanal sandwich cart in San Francisco with a friend of mine, because, let's face it, you can't get a decent deli-style sandwich in the Bay Area. In the end, I had a job offer from a high-flying biotech startup that would have required me to move to the East Coast, and a job offer to work at a friend's software company, and I chose the latter.

So, this is where I am. I'm currently employed as a Forward Deployed Engineer at Palantir Technologies in Palo Alto. The software is incredible, the people are amazingly smart and fun, and my group is composed mostly of Ph.D.s who left science to try something else, and wound up here. I do a bit of everything: I project manage, I code, I do some outreach, I integrate data, and I sometimes look for new and interesting ways to use our product. I've been here for four months, and it's been pretty much non-stop excitement.

I've learned a lot. First, that there are a lot of really smart people out in the private sector, if you look in the right places. Scary smart people, the kind that academics will tell you don't work in the private sector. Second, I've realized that many of the dysfunctional relationships I had in the academic world were not actually due to my personality flaws, but were largely due to the peculiar culture that tolerates (and in some cases rewards) dysfunctional interpersonal relationships in academia. It's refreshing to work with people who are smart, engaged, enthusiastic, and who genuinely want to work together to create something worthwhile and powerful. I think, to some extent, the archetypal academic interaction is the pissing contest, where people jockey for status, because status is the only currency in the academic world. All other forms of interaction are subordinate to the pissing contest. It's refreshing to step away from that world. And, third, it took me two or three months out of academia to realize how really bored I was with what I was doing. It's not that I think it's intrinsically boring; it's just that it wasn't really driving me to do more and accomplish more, but I had had myself convinced that this was the way to go, that this was interesting because everybody else said it was. With some hindsight, I can see that if I had found it really that fascinating, I would have been eager to get up and get in there to do more. And there just wasn't that drive, and it was making me miserable.

So, I'm here, I'm finally liking what I'm doing, and I'm liking the people I'm doing it with. I'm getting up in the morning excited to come over here and face the challenges of the day. I'm still advising the grad student who's following up on my work a little bit, and I'm even doing some consulting for a biotech startup in Silicon Valley, just to stay in the game for fun. And, if I ever start to get bored with what I'm doing, I'll remember what it feels like, and I'll do something else. I don't know if I'll keep updating this blog again. Now that I've come back and gotten the long-overdue explanation out of the way, I may just post little sciencey tidbits here and there to amuse myself. We'll see.

Tuesday, January 14, 2014

The Civilized Soldier

"The civilized soldier when shot recognizes that he is wounded and knows that the sooner he is attended to the sooner he will recover. He lies down on his stretcher and is taken off the field to his ambulance, where he is dressed or bandaged. Your fanatical barbarian, similarly wounded, continues to rush on, spear or sword in hand; and before you have the time to represent to him that his conduct is in flagrant violation of the understanding relative to the proper course for the wounded man to follow—he may have cut off your head." --Sir John Ardagh, discussing the necessity of "dum dums" (or expanding bullets) for international warfare, at the Hague Convention of 1899, which subsequently banned their use for international conflicts. Via Wikipedia.

Monday, November 18, 2013

Reflections from Bouldering

I've been spending a lot of time bouldering recently, a few times a week.  Besides giving me a strong incentive to lose 10 lbs, it has started to teach me some interesting lessons, the hard way.

Indoor bouldering is like rock climbing, but the highest it gets is about 17 feet, the floors are padded about two feet thick, and there are no ropes.  That means I can show up whenever I want, alone, and climb for as little or as much time as I want, and not need someone to belay me.  It also means that the first few times you climb, it can be pretty unnerving because when you fall, you just fall, boom, onto the mat.  In fact I noticed that as I grew more tired during climbing, if I thought I wasn't going to be able to make it to the top, I would frequently climb or jump down while I still had control, even if I had some power left,  because I wanted to avoid being all the way at the top and having no strength left, forcing me to fall uncontrolled from the top, as opposed to falling in a controlled way from halfway up.  But, after you fall from the top a few times, this turns out to be a mistake: falling from the top doesn't hurt.  That's why they let you do it and don't get sued too often (although, the place is actually blanketed in cameras, in deference to our tort-happy society: just in case someone does something stupid and sues, they have you on record.)

But, the more interesting thing that I discovered was that the cause of failure was often simply exhaustion rather than a lack of skill.  And this has a particularly interesting consequence: often, the best next move is making the next move.  As a beginner, your instinct is to stop at each hold, look around, and see where the next move is.  Which hold can you reach without falling over?  But the skilled climbers in the gym do it differently: first, they study the route before they start climbing.  Then, once they're on their way up, they move gracefully and smoothly from one hold to another, and importantly, they keep moving.  While you're stopped, looking around, your arms are growing tired, your tendons are aching, the skin on your fingers is starting to grate under the hand holds.  And what I found, through brutal trial and error, was that I was much more consistently successful if I just kept moving.  In an indoor bouldering gym, the holds are laid out somewhat logically, but also somewhat deviously, so that it's not always obvious what the solution is.  But your brain moves pretty quickly, and without even realizing you're doing it, you're not even considering 9 out of the next 12 possible moves.  The extra 5 seconds that it takes per move to decide amongst those three remaining moves is probably the difference between near-complete exhaustion and complete exhaustion.  And complete exhaustion means failure.  I'm a big fan of stopping and thinking about what you're doing, but the lesson is, when you're resource-constrained and time is not on your side, don't think too hard.  You might back yourself into a corner, but it's no worse than falling on your ass.

Friday, November 1, 2013

We Apologize for the Interruption in our Interrupted Programming

Not much call for blogging these days; most of the interesting Data Blogging Topics(tm) have been around the Snowden NSA leaks, and I've been trying to slog through a number of other things at work, so it's hard to find the energy to get invested in it.  But, I wanted to stop by the blog and give a brief update to the assembled masses who may read this later (and, I have found out the hard way, blogs left unattended can come back and bite you in the ass.)

When the Snowden/NSA leaks first started coming out, the scope was pretty limited.  Phone call metadata logging was the big topic, and my comments were primarily technical in nature.  My decision to not express a particular opinion on the politics might have been construed as a tacit approval, or at least a lack of outrage, and I think the latter was probably not far off.

In the meantime, however, a lot more things have come out, such as the fact that the NSA has p0wnz0red the entire internet, and we've been eavesdropping on foreign heads of state and American citizens for essentially no reason.  So, I wanted to update, for clarity, my feelings: this is odious, un-American, and a fundamental breach of the public trust.  The excellent New Yorker article about Alan Rusbridger, the editor of The Guardian, indicates that we're still just seeing the tip of the iceberg; the only limiting factor is how fast the journalists can process and understand the documents they've been given*.  If you want to read more, the incomparable Bruce Schneier is your go-to source, and I truly couldn't hope to add anything.

Sadly, in my search for hyperbole to compare this with, my mind goes back only as far as the buildup to the Iraq War, and it's hard for me to draw a comparison really: it's apples to oranges.  The Iraq War was a breach of the public trust in a fundamental way, but it involved a lie which resulted in the deaths of over 100,000 humans.  It's hard for me to draw a meaningful comparison there that doesn't minimize those deaths.  But the damage from fundamentally weakening the internet is hard to grasp, in both its scope and its consequences.  The ripples from this tidal wave will continue to leave marks in the sand well into the next generation, and only time will tell.

*The article also refers to Rusbridger's memoir, in which he interleaves his Herculean work publishing the Snowden leaks with his year-long struggle to master a particularly difficult work by Chopin.  I immediately thought, "He must not have any children," but of course, we find out, he does.  It is times like this that I am reminded of the late great David Foster Wallace's characterization of Wilhelm Leibniz in one of my favorite books ever written, "Everything and More: A Compact History of Infinity".  He describes Leibniz, one of the inventors of the calculus, as "a lawyer/diplomat/courtier/philosopher for whom math was sort of an offshoot hobby", which he tags, in typical David Foster Wallace fashion, with a footnote, saying only, "Surely, we all hate people like this."