Archive for the 'Technology' Category

Different Strokes for Different Folks

Friday, May 27th, 2011

I was a co-founder and CTO of NetGravity back in 1995, and I spent a lot of years building ad servers. So I can say this from experience: ad servers are hard to build. There’s a lot of expertise that goes into delivering the right ad to the right person where response times are measured in milliseconds and uptimes are measured with five nines. Now it’s been almost four years since I co-founded Yieldex, and I can say another thing from experience: predictive analytic systems are also hard to build. Moving, managing and processing terabytes of data every day to enable accurate forecasting, rapid querying, and sophisticated reporting takes a lot of expertise. And, as it turns out, it’s very different from ad server expertise.

When I ran an ad server team, all we wanted to work on was the server. We delighted in finding ways to make the servers faster, more efficient, more scalable, and more reliable. We focused on selection algorithms, real-time order updates, handling malformed URLs, and all kinds of crazy stuff. Our customers were IT folks and ad traffickers who needed to make sure the ads got delivered. The reporting part was always an afterthought, a necessary evil, something given to the new guy to work on until someone else came along.

Contrast that with the team I have now, where we obsess over integrating multiple data sources, forecasting accurately, and clearly presenting incredibly complex overlapping inventory spaces in ways that mere humans can understand and make decisions about. Reporting is not an afterthought for us; it’s our reason for being. Our success criteria are accuracy, response time, and the ease of getting questions answered through ad-hoc queries. Our customers are much more analytical: inventory managers, sales planners, finance people, and anyone else in the organization working to maximize overall ad revenue.

These teams have completely different DNA, so it’s not surprising that a team good at one might not be so great at the other. This is why so many publishers are unhappy with the quality of the forecasts they get from their ad server vendor, and one of the reasons so many are signing up with Yieldex. Good predictive analytics are hard to build, and nobody had assembled the right team and spent the time and effort to get them right for the digital ad business. Until now.

Cloud Computing Lessons

Wednesday, May 6th, 2009

Cloud computing means lots of different things, and much of it is hype. At Yieldex, we’ve been using cloud computing, specifically Amazon Web Services, as a key part of our infrastructure for the better part of a year, and we thought we’d pass on a few of our lessons learned. As you might expect, the services we use have trade-offs. If your challenge fits within the parameters, cloud computing can be a huge win, but it’s not the answer for everything.

All of these lessons are the result of the hard work of our entire engineering team, most notably Craig and Calvin. These guys are among the best in the world at scaling to solve enormous data and computation problems with a cloud infrastructure. We could not have built this company and these solutions without them.

For a startup, there are a number of compelling reasons to use a cloud infrastructure for virtually every new project. You don’t get locked into a long-term investment in hardware and data centers, it’s easy to experiment, and it’s easy to change your mind and try a different approach. You don’t have to spend precious capital on servers and storage, wait days or weeks for them to arrive, and then spend a day or two setting them up. If your application scales horizontally, you can add customers, storage, and processing with minimal cost and delay. All these things are touted by cloud providers, and they basically boil down to: focus on your business, not your infrastructure.

Sometimes, however, you do need to focus on the infrastructure. We provide our customers with analytics and optimization based on our unique and proprietary DynamicIQ engine. Our first customer was a decent-sized web property, and we were able to complete our DynamicIQ daily processing on several gigabytes of data using just one instance in less than an hour. Our next customer, however, was 10x the size. And the one after that, 10x more – hundreds of gigabytes per day. Fortunately, we had designed our DynamicIQ engine to easily parallelize across multiple instances. We spent some time learning how to start up instances, distribute jobs to them, and shut them back down again, but because we had designed the engine for this eventuality, we were able to use the cloud to cost-effectively scale to even the largest sites on the web.
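To make that shape concrete, here is a minimal sketch of the launch-distribute-terminate loop. To be clear, this is not our actual code: it is written against today’s TypeScript AWS SDK rather than anything we ran back then, and the AMI, instance type, queue, and job format are all placeholders.

// Sketch only: hand a day's jobs to temporary EC2 workers via a queue,
// then shut the workers down. All identifiers below are placeholders.
import { EC2Client, RunInstancesCommand, TerminateInstancesCommand } from "@aws-sdk/client-ec2";
import { SQSClient, SendMessageCommand } from "@aws-sdk/client-sqs";

const ec2 = new EC2Client({ region: "us-east-1" });
const sqs = new SQSClient({ region: "us-east-1" });

async function runDailyJobs(jobs: string[], queueUrl: string): Promise<void> {
  // Distribute the work: one queue message per job, pulled by the workers.
  for (const job of jobs) {
    await sqs.send(new SendMessageCommand({ QueueUrl: queueUrl, MessageBody: job }));
  }

  // Start one worker per ten jobs. The (hypothetical) worker image is
  // assumed to boot, poll the queue, and process jobs until it is empty.
  const count = Math.max(1, Math.ceil(jobs.length / 10));
  const launched = await ec2.send(new RunInstancesCommand({
    ImageId: "ami-EXAMPLE",   // placeholder worker image
    InstanceType: "m5.xlarge",
    MinCount: count,
    MaxCount: count,
  }));

  // Once the queue drains (not shown), terminate the workers so the
  // meter stops running.
  const ids = (launched.Instances ?? []).map((i) => i.InstanceId!);
  await ec2.send(new TerminateInstancesCommand({ InstanceIds: ids }));
}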

We also have BusinessIQ, which is basically an application server that provides query processing and a user interface into our analytics. We started with this server in the cloud too, but as we bumped up against other scalability issues, we found that the cloud doesn’t solve every problem. For example, we provide a sophisticated scenario analysis capability. Calculating a “what-if” scenario requires processing a huge amount of data in a very short time, and for our larger customers a single cloud instance did not have enough memory to do it. Trying to stay true to the cloud paradigm, we implemented a distributed cache across multiple instances, but that didn’t work well because of I/O limitations. We ended up going to a hybrid model, buying and hosting our own servers with large memory footprints, so we could provide this functionality.
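Some made-up but representative numbers show why the remote cache lost out: when a single scenario calculation touches the data tens of millions of times, the gap between an in-memory lookup and a network round-trip is the gap between seconds and hours.

// Illustrative arithmetic only – the counts and latencies are invented,
// but the order-of-magnitude gap is the point.
const lookups = 50_000_000;   // data reads for one what-if scenario
const ramNs = 100;            // in-process lookup: ~100 nanoseconds
const netUs = 500;            // networked cache lookup: ~0.5 milliseconds

console.log(`in RAM: ${(lookups * ramNs) / 1e9} seconds`);                 // ~5 seconds
console.log(`over the network: ${(lookups * netUs) / 1e6 / 3600} hours`);  // ~7 hours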

We have been very happy users of the Amazon Web Services cloud, and not just because we won the award. We would not have been able to get our business off the ground without the cost-effective scalability of the Amazon infrastructure. While it’s not for every application, for the right application it truly changes the game.

Yieldex wins Amazon AWS Start-up Challenge!

Friday, November 21st, 2008

We won! Out of nearly 1000 startups who applied, we won!

This is a great validation of our fantastic technical team. We have been chosen as one of the most innovative users of Amazon’s cloud computing technology. We could not have done this without your hard work. Thank you!

This was a great experience for us. The Amazon team was very professional throughout, the event was well-managed, and they even made a cool video featuring our development team. The press release went out tonight, and there was even a blog post that beat mine.

Here’s the blow-by-blow, for those who want all the details:

We pitched the panel of judges, all senior execs at Amazon, at 1pm. They had 50-minute presentations from each of the seven finalists, and had been going since 7am. We did our standard pitch, and did a great job talking about how important AWS is to us. They seemed to appreciate the presentation, but were somewhat poker-faced, so while we felt we did a good job, it was hard to read their reaction.

Later in the afternoon, we had a “VC speed-dating” event, where we had 10 minutes with each of 5 VCs. The firms were all first-rate (BlueRun, Hummer Winblad, Madrona, Greylock, and CMEA). Our product is pretty complex, so it’s hard to get across in 10 minutes, but we did our best, and each of the VCs seemed to get it quickly enough. All were interested in following up, but again, hard to tell how we ranked.

Then there was a reception while the judges and the VCs deliberated. They had invited 200 other startup people to come hear how the seven finalists were using AWS. My guess is that closer to a hundred people were in the room, and we had to give another 10-minute presentation on AWS, with slides, to this group. We managed to do it in only 5-6 minutes and still get our message across. Finally, around 9pm, it was time to announce the winner. We were jubilant when they picked us – I let out a shout of joy and a fist pump, to the delight of the audience.

Andy Jassy, the SVP of AWS, said some nice words and gave us the traditional golden hammer. We were then invited to take a whack at an old rackmount server they had, to symbolize the destruction of our own servers. John and I both hammered it pretty hard, but we barely dented it – those steel frames are tough.

Then everyone came up to congratulate us, and we shook hands with big grins on our faces. We took a couple of pictures, approved the quote in the press release, and talked with the Amazon folks some more. All great stuff.

Finally, we headed out for a celebratory dinner. Once again, thanks to everyone in the company for your hard work – we did the talking, but we could not have done this without all of you.

Hooray!

Scratching an itch and enhancing CellarTracker

Monday, August 25th, 2008

Once in a while I feel the need to write some actual code. I’m really a programmer at heart, and I find it incredibly satisfying to think through a problem and create a useful solution. For me it’s more interesting than crossword puzzles, and the end product is (sometimes) more valuable.

For a variety of obvious reasons, it’s not a good idea for me to get my fingers into Yieldex production code, so I end up scratching this particular itch with small side projects. This one took me a couple of hours, and hopefully will be useful for enough people for it to have been worth it. In any case, I learned a lot, so it was worth it for me.

I’ve written about CellarTracker before. I’ve been a user for years; I love the site, and I love the business. But, I find some of the UI to be less than perfect. Here’s an example: when I get a wine offering in the mail, there are usually a number of different vineyard designations, and I get an allocation of a few bottles from each. I haven’t found a way in CellarTracker to enter 2 bottles each from 10 different designations without doing an incredible amount of clicking around and waiting for pages to refresh. So, I set out to solve this problem with a “bulk purchase” mechanism.

There are a number of different ways to tackle this problem, so the first thing was to decide on an approach. I’m still a bit 1999 when it comes to coding for the web, so my first idea was to set up a server-side program. Then I started thinking about the complexity of screen-scraping and parsing, and about robustness in the face of potential CellarTracker updates. Then there were the security issues of passing usernames and passwords so my server could log in to CellarTracker. Finally, I realized I didn’t really want to be responsible for keeping the service up and running, so I almost bailed on the entire idea.

Then I remembered GreaseMonkey for Firefox. Cool – an opportunity to enter the 2000s in web programming and polish up some JavaScript skills. And it got around all of the above problems in a neat client-side way. The only real issue is that it works only with Firefox, and for most people it would require installing GreaseMonkey and then the script itself.

I started by installing GreaseMonkey and a couple of web development tools, notably the DOM Inspector, the Javascript Shell bookmarklet, and then later the Web Developer Toolbar. I read quickly through the Dive Into GreaseMonkey book, and then just started coding. I was pretty excited that in only a few minutes I could build a script to automatically change a CellarTracker page upon load.

After a couple hours of experimentation, some heavy shell use, and a bit of DOM inspection, I had something up and running. I created a test account on CellarTracker, and entered a bunch of purchases. Success! I finished up by spending a few minutes on data validation, trying to make it obvious when something went wrong and how to fix it.
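For anyone who hasn’t seen a GreaseMonkey script, here is the rough shape of the thing. This is only a sketch, not the script I actually posted: it is written as TypeScript for readability (GreaseMonkey takes plain JavaScript, so the type annotation would be stripped), and the URL pattern and form handling are placeholders rather than CellarTracker’s real markup.

// ==UserScript==
// @name     CellarTracker Bulk Purchase (sketch)
// @include  http://www.cellartracker.com/*
// ==/UserScript==

// Parse pasted lines like "Estate Vineyard, 2, 45.00" into
// [designation, quantity, price] rows, dropping anything malformed.
function parseRows(text: string): string[][] {
  return text
    .split("\n")
    .map((line) => line.split(",").map((part) => part.trim()))
    .filter((parts) => parts.length === 3);
}

// Inject a textarea and a button into the page when it loads.
const box = document.createElement("textarea");
box.rows = 10;
const button = document.createElement("button");
button.textContent = "Add bulk purchase";
button.addEventListener("click", () => {
  for (const [designation, qty, price] of parseRows(box.value)) {
    // A real implementation would fill in and submit the site's purchase
    // form once per row; here we just show what would be entered.
    console.log(`Would add ${qty} bottles of ${designation} at $${price} each`);
  }
});
document.body.prepend(box, button);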

The best part of this project for me was learning one level deeper how JavaScript and the DOM work. JavaScript is a much more powerful language than I remembered from 1999, and I now have a much better understanding of how Ajax and many modern web sites work. And it was fun.

The final step was posting it in the new/old OSS directory here on Oxyfish, and writing this entry. I also posted a note in the CellarTracker forum, in case anybody else wants to use it. I’m very happy with how it turned out, and am looking forward to the next wine offering in the mail, so I can start saving time.

Install the CellarTracker Bulk Purchase extension (requires Firefox and GreaseMonkey)

Thank heaven for the WordPress backup plugin

Friday, July 25th, 2008

I use the WordPress Database Backup plugin to send automated backups of my blog to a dedicated Gmail account on a daily basis. When the disk filled up on my server, the blog database was corrupted. Fortunately, it was the work of about 10 minutes to restore from the backup. Thank heaven for fire-and-forget backup strategies!

Return of GUID.org

Wednesday, August 8th, 2007

When I rebuilt my old machine about three years ago, I had limited time to get things working, so I focused on the essentials (main web site, email, and blog), and ignored the rest. One thing that fell through the cracks was a site I had been running since 1998, guid.org. From the site:

GUID.org is an Internet service that assigns anonymous random user IDs to web browsers. These anonymous IDs can then be used by other web sites for many purposes. For example, a site may use your GUID to recognize you when you return. GUID.org does not collect or store any information about users – see our privacy policy.

GUID.org was conceived back in 1998, when inserting a “web bug” to correlate users across domains was still new technology. Now that technology is old hat, but I still think there may be a use for a universal GUID that can be shared by lots of sites.

If anyone comes up with a really great plan for how to use this technology (and domain) in this modern world of internet advertising, I’m all ears. I’m sure there’s a pony here somewhere…

Blog Merge

Friday, August 3rd, 2007

I finally decided that I didn’t need two blogs, so I merged them into one. Blogs do seem to naturally divide into several types, one being the thoughtful occasional post blog, and another being the short blurb with a link type. So, I originally decided to have two different blogs, Tom’s Tech Toys for short links, and Oxyfish for longer items. But managing two blogs is just silly, especially since I don’t post that much anyway. So I merged the other one into this one. Read on if you want technical details.


Technical Difficulties

Monday, April 23rd, 2007

Sorry for the lack of posting. You’ll notice a few posts that have stacked up over the last few weeks. I had a few technical difficulties that I had to recover from, and with a full-time job and 3 kids at home, I don’t have a lot of time for diagnosing tech problems.

I had recently decided to play with Ruby and Rails, so I upgraded a few packages on my FreeBSD system to be able to deploy some toy applications I was writing as a learning exercise. Unfortunately, at some point along the way I broke something, and WordPress started giving me “Fatal error: Call to undefined function: xml_parser_create()”. I tried using Flock to post, and it hit the same problem, although I had to turn on error logging and look at “blog.log” to see it.

I started doing some digging, and found that xml_parser_create() is part of the XML extension to PHP, so I tried rebuilding that – no dice. Then I updated all my ports, and tried again. Now it won’t even compile. The error is:

libtool: ltconfig version `' does not match ltmain.sh version `1.3.5-freebsd-ports'

Hmm. Google didn’t help on this one. I tried rebuilding PHP 4.4.6, but that failed as well. Googling that problem again didn’t help. Hmm.

After still more digging, I figured out that I’m running FreeBSD 4.1, which has recently been End-of-Lifed, so the ports are no longer guaranteed to work. Arrgh! Now somewhere I’m going to have to find time to upgrade to FreeBSD 6.

I still needed to fix the problem, so I tried using the packages system and installing some old versions of the packages. That totally screwed up everything, so I quickly backed that out.

I went back to the beginning and turned on PHP error logging. When I restarted Apache, I got this in the error log:

PHP Warning: Unknown(): Unable to load dynamic library '/usr/local/lib/php/20020429/xml.so' - Shared object "libexpat.so.5" not found in Unknown on line 0

Now I’m getting somewhere – the library is just not being found. I found libexpat.so.6, but not .5. After rebuilding libexpat, I discovered that the expat 2 I have, which I must have upgraded to as part of the Ruby/Rails install, is what provides the .6 version of the library. I couldn’t easily find the .5 version, but Google pointed me to a great idea: just symlink the .6 version to .5 to fool PHP. Bingo! It works!

System administration can be a pain – I admire those who do it for a living. I do it once in a while, just so I remember how painful it is sometimes. Now, to post a few entries that have stacked up.

Paying taxes to Microsoft

Tuesday, April 17th, 2007

[I started writing this at the Microsoft VC Summit a few weeks ago, but tax day spurred me to complete it.]

Other than hardcore libertarians, I think there are few who would dispute that government funding of fundamental research is a good thing. Much of this research is done at top universities funded by government grants, but there are also institutions like DARPA, NASA, and NIH that are directly funded. Most corporations, with their focus on quarterly earnings, have too short a time horizon to spend significant amounts of money on research that doesn’t have an obvious return on investment in a relatively short time frame.

There have always been a few exceptions, and what is interesting is what they seem to have in common. For example, Bell Labs springs to mind as a great exception. They produced literally thousands of innovations, most of which were (at the time) commercially unusable. Another classic example is Xerox PARC. Once again, tremendous innovative and fundamental research, with little commercial application. What is interesting about both these companies is they were essentially monopolies, and highly profitable, such that their products were referred to as “taxes”. Today we see companies like Microsoft and Google engaged in similar research efforts (although Google’s is pretty young still).

I heard Steve Ballmer speak the other day, and he boasted several times about Microsoft earning $20 billion last year. Many who were with me groaned about the egregious “Microsoft tax” and expounded on how much better the industry would be if everyone used Linux and OpenOffice and the $20 billion were returned to the users.

This prompted a spirited discussion at lunch, which included some long-time Microsoft execs. The “Microsoft tax” is pretty small for each individual. Which creates the greater good: giving each computer user a small amount of money back to spend as they wish, or allowing Microsoft to engage in fundamental research that may improve the lives of everyone? Viewed this way, it looks much the same as increasing the income tax 0.01% to pay for NASA. Of course, this only applies to the $1 billion or so that Microsoft spends on research. The other $19 billion that is dividended back to shareholders is more like a reverse Robin Hood – take from computer users, give to MSFT stockholders.

Now, there are plenty of egregiously profitable companies – Exxon Mobil for example – that don’t spend nearly as much on blue-sky research as they could (despite their marketing that says they do). Perhaps instead of legislating lower profits for them, the federal government should consider legislating more pure research? Of course, this would be hard to verify, but it would be a start.

Slingbox Success Story

Thursday, February 1st, 2007

A friend of mine is from New Orleans, and is a huge Saints fan. The weekend they were in the championship game, we were headed to Tahoe with our families, and staying in a house with no TV reception. He threatened to leave early, to get home in time for the game. What to do?

We didn’t even have a broadband internet connection. But we did have a decent cellphone signal, and I have an EVDO phone that I can connect to my laptop. And I have a Slingbox at home. Aha!

Challenge number one: the cell tower hadn’t been upgraded to EVDO, so we were stuck with 1xRTT, which is much slower. I was seeing about 100Kbps sustained throughput. Fortunately, Slingbox is very good at adapting to the available bandwidth, so the picture was okay, although a bit blurry (hard to read the time remaining) and blocky during fast motion. Sound was great, which helps a lot.

Challenge number two: nobody wants to watch the game on the computer; we want to watch it on the big screen. Fortunately I have a Mac, which has TV-out. I also carry cables to hook the Mac into the stereo as well as the TV, so we had good sound and a good picture.

Challenge number three: my home cable service was inadvertently cut the day we headed up there. So the Slingbox was working great, but there was no TV signal at all. This was a stumper. I scoured the internet for other streaming services that could get us the game, but aside from a few questionable-looking sites, I couldn’t find anything.

When all else fails, try asking someone else. I posted a question on LinkedIn Answers. Within an hour, I got two folks from my network offering to let me connect to their Slingboxes. Problem solved!

I connected to a Slingbox that happened to be located in Atlanta. Its owner had a Comcast DVR hooked up, so we could even rewind and fast-forward if we wanted, although it was a bit slow to respond. The whole thing worked almost flawlessly, and at times we forgot that we were watching over a pretty slow network connection.

My friend was ecstatic to watch the game, although his ecstasy soon turned to agony. We will be up there again for Super Bowl weekend, so we’ll have a chance to try it again, although this time it should be through our own Slingbox. Now to hook the Slingbox up to the Media Center PC, so I can get my recorded shows…