Wednesday, December 27, 2006

Skype 3.0.0.190 is now in general release for Windows. Amongst all the changes mentioned in the Skype Garage post is this little gem:

change: API: application can connect to oneself


This addresses an interesting issue. Unlike most network addressing schemes, Skype connects to a username, and there is nothing to stop a user from running Skype on many machines as the same user. When app2app messaging connects to a user, you get an array of streams that connect you to all the endpoints for that user.

However, in the past you could not make a connection to yourself; now you can. So if you connect to your own username, you get back an array of streams to your other instances. This could become quite useful for keeping all kinds of things synchronized across multiple machines.
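To sketch what this enables (hypothetical code: the stream objects and their `send()` method stand in for whatever your Skype API wrapper returns from an app2app connect to your own username):

```python
def broadcast_state(streams, payload):
    """Send the same text payload down every app2app stream, i.e. to
    every other logged-in instance of your own username.
    Returns the number of endpoints that accepted the message."""
    delivered = 0
    for stream in streams:
        try:
            # send() is a placeholder for the wrapper's stream-write call
            stream.send(payload)
            delivered += 1
        except IOError:
            # a ghost stream from a previous login may refuse the message
            continue
    return delivered
```

Anything that can be serialized to text could be pushed this way to keep bookmarks, preferences or session state in sync.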

Saturday, November 25, 2006

Processing vxstat to read into R

I got bored with my iostat data, and found some interesting-looking vxstat logs to browse with the Cockcroft Headroom Plot. To get them into a regular format I wrote the short Awk script shown below. It skips the first record, adds a custom header and drops the time field into the first column.


# process vxstat file into regular csv format
BEGIN { skipping=1; printf("time,vol,reads,writes,breads,bwrites,tread,twrite\n"); }
NR < 4 {next} # skip header
NF > 0 && skipping==1 {next} # skip first record of totals since boot
NF == 0 {skipping=0}
NF == 5 {time=$0}
NF == 8 {printf("%s,%s,%s,%s,%s,%s,%s,%s\n",time,$2,$3,$4,$5,$6,$7,$8);}


It turns a file that looks like this:

OPERATIONS BLOCKS AVG TIME(ms)
TYP NAME READ WRITE READ WRITE READ WRITE

Mon May 01 19:00:01 2000
vol home 88159 346799 17990732 3680604 13.7 15.6
vol local 64308 103869 3848746 410899 6.0 22.0
vol orahome 80240 208372 18931823 886870 11.9 21.1
vol rootvol 336544 537741 21325442 8566302 4.8 323.1
vol swapvol 32857 339 4199304 58160 13.8 22.5
vol usr 396221 174834 11766646 2872832 3.5 547.6
vol var 316340 1688518 25138480 19275428 11.1 53.7

Mon May 01 19:00:31 2000
vol home 1 28 4 129 10.0 34.3
vol local 0 2 0 8 0.0 330.0
vol orahome 4 20 24 88 10.0 84.0
vol rootvol 0 80 0 720 0.0 9.4
vol swapvol 0 0 0 0 0.0 0.0
vol usr 0 1 0 16 0.0 20.0
vol var 4 235 54 2498 15.0 13.7

... and so on


into

% awk -f vx.awk < vxstat.out
time,vol,reads,writes,breads,bwrites,tread,twrite
Mon May 01 19:00:31 2000,home,1,28,4,129,10.0,34.3
Mon May 01 19:00:31 2000,local,0,2,0,8,0.0,330.0
Mon May 01 19:00:31 2000,orahome,4,20,24,88,10.0,84.0
Mon May 01 19:00:31 2000,rootvol,0,80,0,720,0.0,9.4
Mon May 01 19:00:31 2000,swapvol,0,0,0,0,0.0,0.0
Mon May 01 19:00:31 2000,usr,0,1,0,16,0.0,20.0
Mon May 01 19:00:31 2000,var,4,235,54,2498,15.0,13.7
... and so on


This can easily be read into R and plotted using


> vx <- read.csv("~/vxstat.csv", header=T)
> vxhome <- vx[vx$vol=="home",]
> chp(vxhome$reads,vxhome$tread)


One of the files I tried was quite long, half a million lines. It loaded into R in fifteen seconds, and the subsequent analysis operations didn't take too long. Try that with a spreadsheet... :-)
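For anyone who would rather not use Awk, the same record-splitting logic can be sketched in Python (a rough equivalent, assuming the vxstat format shown above):

```python
def vxstat_to_csv(lines):
    """Convert vxstat interval output to CSV rows, skipping the three
    header lines and the first record (totals since boot)."""
    out = ["time,vol,reads,writes,breads,bwrites,tread,twrite"]
    skipping = True
    time = None
    for n, line in enumerate(lines, start=1):
        fields = line.split()
        if n < 4:
            continue                 # skip the two header lines and blank
        if fields and skipping:
            continue                 # skip the totals-since-boot record
        if not fields:
            skipping = False         # a blank line ends a record
        elif len(fields) == 5:
            time = line.strip()      # date line, e.g. "Mon May 01 19:00:31 2000"
        elif len(fields) == 8:
            out.append(",".join([time] + fields[1:]))
    return out
```

Feeding it the same vxstat.out text produces the same CSV rows as the Awk script.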

Slingbox for Xmas

What new toys can we get this Xmas? I already have the stuff I need. I'd like a phone with Wifi and 3G network speeds and a touch screen, but my Treo 650 is OK until something better comes along. I'm curious to see what Apple may come up with next year, in the much rumoured iPhone.

I've had a Tivo since 1999, and I'd like to be able to view the programs elsewhere in the house or further afield. The Slingbox does this: it lets me control the Tivo remotely and stream the programs to a Windows or OSX laptop. The Slingbox AV was $179 list price on their web site, but I had a look on shopping.com and found it for sale from an out-of-state vendor for $140 with free shipping and no tax. So that's going to be the new toy this Xmas....

Thursday, November 23, 2006

Cockcroft Headroom Plot - Part 3 - Histogram Fixes

I found that I had some scaling issues with the histograms that needed fixing. Ultimately this made the code look a lot more complex, but it now deals with scaling the plot and the histogram with a fixed zero origin on both axes. I think it's important to maintain the zero origin for a throughput vs. response time plot.

The tricky part is that the main plot is automatically oversized from its data range by a few percent, and the units used in the histogram are completely different. A histogram with 6 bars is scaled to have the bars at unit intervals, so it is 6 wide plus the width of the bars etc. After lots of trial and error, I made the main plot use the maximum bucket size of the histogram as its max value, and artificially offset the histograms by what looks like about the right amount. The plot below uses fixed data as a test. You can see that the first bar includes two points; that's due to the particular binning algorithm used by R. Some alternative histogram algorithms are available, but this one seems the most appropriate for throughput/response time data.

> chp(5:10,5:10)



The updated code follows.

chp <- function(x, y, xl="Throughput", yl="Response", tl="Throughput Over Time",
                ml="Cockcroft Headroom Plot") {
    xhist <- hist(x, plot=FALSE)
    yhist <- hist(y, plot=FALSE)
    xbf <- xhist$breaks[1]                    # first break
    ybf <- yhist$breaks[1]                    # first break
    xbl <- xhist$breaks[length(xhist$breaks)] # last break
    ybl <- yhist$breaks[length(yhist$breaks)] # last break
    xcl <- length(xhist$counts)               # count length
    ycl <- length(yhist$counts)               # count length
    xrange <- c(0, xbl)
    yrange <- c(0, ybl)
    nf <- layout(matrix(c(2,4,1,3), 2, 2, byrow=TRUE), c(3,1), c(1,3), TRUE)
    layout.show(nf)
    par(mar=c(5,4,0,0))
    plot(x, y, xlim=xrange, ylim=yrange, xlab=xl, ylab=yl)
    par(mar=c(0,4,3,0))
    barplot(xhist$counts, axes=FALSE,
            xlim=c(xcl*0.03 - xbf/((xbl-xbf)/(xcl-0.5)), xcl*0.97),
            ylim=c(0, max(xhist$counts)), space=0, main=ml)
    par(mar=c(5,0,0,1))
    barplot(yhist$counts, axes=FALSE, xlim=c(0, max(yhist$counts)),
            ylim=c(ycl*0.03 - ybf/((ybl-ybf)/(ycl-0.5)), ycl*0.97),
            space=0, horiz=TRUE)
    par(mar=c(2.5,1.7,3,1))
    plot(x, main=tl, cex.axis=0.8, cex.main=0.8, type="S")
}

Monday, November 20, 2006

Cockcroft Headroom Plot - Part 2 - R Version

I kept tweaking the code, and came up with a prettier version, that also has a small time series view of the throughput in the top right corner.



The code for this is

chp <- function(x, y, xl="Throughput", yl="Response", tl="Throughput Time Series",
                ml="Cockcroft Headroom Plot") {
    xhist <- hist(x, plot=FALSE)
    yhist <- hist(y, plot=FALSE)
    xrange <- c(0, max(x))
    yrange <- c(0, max(y))
    nf <- layout(matrix(c(2,4,1,3), 2, 2, byrow=TRUE), c(3,1), c(1,3), TRUE)
    layout.show(nf)
    par(mar=c(5,4,0,0))
    plot(x, y, xlim=xrange, ylim=yrange, xlab=xl, ylab=yl)
    par(mar=c(0,4,3,0))
    barplot(xhist$counts, axes=FALSE, ylim=c(0, max(xhist$counts)), space=0, main=ml)
    par(mar=c(5,0,0,1))
    barplot(yhist$counts, axes=FALSE, xlim=c(0, max(yhist$counts)), space=0, horiz=TRUE)
    par(mar=c(2.5,1.5,3,1))
    plot(x, main=tl, cex.axis=0.8, cex.main=0.8, type="S")
}


I also made a wrapper function that steps through the data over time in chunks.

> chp.step <- function(x, y, steps=10, secs=1.0) {
    xl <- length(x)
    step <- xl/steps
    for(n in 0:(steps-1)) {
        Sys.sleep(secs)
        chp(x[(1+n*step):min((n+1)*step,xl)], y[(1+n*step):min((n+1)*step,xl)])
    }
}
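The same stepping logic, minus the plotting and the sleep, can be sketched in Python; in this version the last chunk absorbs any remainder:

```python
def chunks(xs, steps=10):
    """Yield successive slices of xs, mirroring the chp.step() indexing:
    chunk n covers items n*step up to (n+1)*step."""
    step = len(xs) // steps
    for n in range(steps):
        lo = n * step
        hi = len(xs) if n == steps - 1 else (n + 1) * step
        yield xs[lo:hi]
```

Each yielded slice would be handed to a chp()-style plotting call in turn.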


To run this smoothly on Windows, I had to disable double buffering using

> options("windowsBuffered"=FALSE)

and close the graphics window so that a new one opens with the new option.

The data is displayed using the same calls as described in Part 1. The next step is to try some different data sets and work on detecting saturation automatically.

Sunday, November 19, 2006

The Cockcroft Headroom Plot - Part 1 - Introducing R

I've recently written a paper for CMG06 called "Utilization is Virtually Useless as a Metric!". Regular readers of this blog will recognize much of the content in that paper. The follow-on question is what to use instead? The answer I have is to plot response time vs. throughput, and I've been thinking about a very specific way to display this kind of plot. Since I'm feeling quite opinionated about this I'm going to call it a "Cockcroft Headroom Plot" and I'm going to try and construct it using various tools. I will blog my way through the development of this, and I welcome advice and comments along the way.

The starting point is a dataset to work with, and I found an old iostat log file that recorded a fairly busy disk at 15 minute intervals over a few days. This gives me 250 data points, which I fed into the R stats package to look at. I'll also have a go at making a spreadsheet version.

The iostat data file starts like this:
                    extended device statistics              
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
14.8 78.4 183.0 2446.3 1.7 0.6 18.6 6.6 1 21 c1t5d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 5.0 0 0 c0t6d0
...

I want the second line as a header, so I save it (my command line is actually on OSX, but could be Solaris, Linux or Cygwin on Windows):
% head -2 iostat.txt | tail -1 > header

I want the c1t5d0 disk, but not the first record, since it's the average since boot; then I add back the header:
% grep c1t5d0 iostat.txt | tail +2 > tailer
% cat header tailer > c1t5.txt

Now I can import into R as a space-delimited file with a header line. R doesn't allow "/" or "%" in names, so it rewrites the header to use dots instead. R is a script-based tool with a command line and a very powerful vector/object-based syntax. A "data frame" is a table-of-data object, like a sheet in a spreadsheet; it has names for the rows and columns, and can be indexed.
> c1t5 <- read.delim("c1t5.txt",header=T,sep="")
> names(c1t5)
[1] "r.s" "w.s" "kr.s" "kw.s" "wait" "actv" "wsvc_t" "asvc_t" "X.w" "X.b" "device"

I only want to work with the first 250 data points, so I subset the data frame by indexing the rows with an array (1:250) that selects the rows I want, leaving the column selector blank.
> io250 <- c1t5[1:250,]

The first thing to do is summarize the data; the output is too wide for the blog, so I'll do it in chunks by selecting columns.

> summary(io250[,1:4])
r.s w.s kr.s kw.s
Min. : 1.80 Min. : 1.8 Min. : 13.5 Min. : 38.5
1st Qu.: 10.30 1st Qu.: 87.1 1st Qu.: 107.4 1st Qu.: 2191.7
Median : 18.90 Median :172.4 Median : 182.8 Median : 4279.4
Mean : 22.85 Mean :187.5 Mean : 290.1 Mean : 4448.5
3rd Qu.: 28.88 3rd Qu.:274.6 3rd Qu.: 287.4 3rd Qu.: 6746.6
Max. :130.90 Max. :508.8 Max. :4232.3 Max. :13713.1
> summary(io250[,5:8])
wait actv wsvc_t asvc_t
Min. : 0.000 Min. :0.0000 Min. : 0.000 Min. : 1.000
1st Qu.: 0.000 1st Qu.:0.3250 1st Qu.: 0.400 1st Qu.: 3.125
Median : 0.600 Median :0.8000 Median : 2.550 Median : 4.700
Mean : 1.048 Mean :0.9604 Mean : 5.152 Mean : 4.634
3rd Qu.: 1.300 3rd Qu.:1.5000 3rd Qu.: 6.350 3rd Qu.: 5.700
Max. :10.600 Max. :3.5000 Max. :88.900 Max. :15.100
> summary(io250[,9:10])
X.w X.b
Min. :0.000 Min. : 2.00
1st Qu.:0.000 1st Qu.:20.00
Median :1.000 Median :39.50
Mean :1.428 Mean :37.89
3rd Qu.:2.000 3rd Qu.:55.00
Max. :9.000 Max. :92.00
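For reference, R's six-number summary() for a single column can be approximated in plain Python; method="inclusive" matches R's default quantile algorithm (type 7):

```python
import statistics

def summary(xs):
    """Min, 1st Qu., Median, Mean, 3rd Qu., Max for one numeric column,
    roughly matching R's summary() output."""
    q1, med, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return {"Min": min(xs), "1st Qu.": q1, "Median": med,
            "Mean": statistics.mean(xs), "3rd Qu.": q3, "Max": max(xs)}
```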


Looks like a nice busy disk, so let's plot everything against everything (pch=20 sets a solid dot plotting character).
> plot(io250[,1:10],pch=20)
The throughput is either reads+writes or KB read+KB written; the response time is wsvc_t+asvc_t, since iostat records time spent waiting to be sent to a disk as well as time spent actively waiting for the disk.

To save typing, I attach to the data frame so that the names are recognized directly.
> attach(io250)
> plot(r.s+w.s, wsvc_t+asvc_t)
This looks a bit scattered, because there is a mixture of average I/O sizes that varies during the time period. Let's look at throughput in KB/s instead.
> plot(kr.s+kw.s,wsvc_t+asvc_t)
That looks promising, but it's not clear what the distribution of throughput is over the range. We can look at this using a histogram.
> hist(kr.s+kw.s)

We can also look at the distribution of response times.
> hist(wsvc_t+asvc_t)
The starting point for the thing that I want to call a "Cockcroft Headroom Plot" is all three of these plots superimposed on each other. This means rotating the response time plot 90 degrees so that its axis lines up with the main plot. After looking around in the manual pages I eventually found an example that I could use as the basis for my plot. It needs some more cosmetic work, but I defined a new function chp(throughput, response), shown below.

> chp <- function(x, y, xl="Throughput", yl="Response", ml="Cockcroft Headroom Plot") {
    xhist <- hist(x, plot=FALSE)
    yhist <- hist(y, plot=FALSE)
    xrange <- c(0, max(x))
    yrange <- c(0, max(y))
    nf <- layout(matrix(c(2,0,1,3), 2, 2, byrow=TRUE), c(3,1), c(1,3), TRUE)
    layout.show(nf)
    par(mar=c(3,3,1.5,1.5))
    plot(x, y, xlim=xrange, ylim=yrange, main=xl)
    par(mar=c(0,3,3,1))
    barplot(xhist$counts, axes=FALSE, ylim=c(0, max(xhist$counts)), space=0, main=ml)
    par(mar=c(3,0,1,1))
    barplot(yhist$counts, axes=FALSE, xlim=c(0, max(yhist$counts)), space=0, main=yl, horiz=TRUE)
}

The result of running chp(kr.s+kw.s,wsvc_t+asvc_t) is close...


That's enough to get started.

PS3 Marketplace Research on eBay


Over at Data Mining there is some interesting info on PS3s.

However, there is no need to do manual scraping of eBay; here is a screenshot from the marketplace research function that is bundled with my eBay store subscription. For $2.99 for 2 days' access, anyone can get at this.

http://pages.ebay.com/marketplace_research/

Skype on Solaris

http://blogs.sun.com/darren/entry/skype_1.3.0.53_on_solaris_via

Solaris has a Linux-compatible subsystem called BrandZ for running Linux binaries that don't have Solaris builds (like Skype). Darren figured out how to get the Linux build of Skype to run on OpenSolaris.

Thanks to Alec for pointing this out.


Saturday, November 11, 2006

10 Things to Know About Skype Ap2Ap Programming

I also posted this on the Skype Developer Wiki

The ap2ap capability is an interesting new network computing paradigm, but it is not like a conventional network.
  1. end nodes are addressed by Skype name, which addresses a person, not a computer

  2. people can log in to Skype multiple times, so addressable endpoints are not unique

  3. Skype can go online/offline at will, so there is a concept of "presence" that needs to be managed

  4. you can only make ap2ap connections to your buddy list or people who you have chatted to "recently"

  5. both ends of an ap2ap connection have to choose a unique string used to identify their conversation or protocol

  6. if you quit and restart Skype, the first login can persist for a while, so you can get multiple ap2ap connections from a single user, although the ghosts of your previous connections cannot respond to a message. I think this is because you connect to a different supernode each time, and the first one isn't sure whether you have really gone away yet

  7. messages have to be sent as text, so binary objects have to be converted first using something like base64

  8. the network can behave differently each time you use it, and this non-determinism makes testing difficult

  9. relayed connections are limited to about 3KB/s, direct ones can run at several MB/s over a LAN

  10. Skype4Java is cross-platform, but the maximum message size is about 64KB on Windows and 16KB on OSX/Linux, and there are several bugs and limitations in the older version of the API library that is used by Skype 2.0 and earlier releases. Use Skype 2.5 or later for the best performance and stability
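Point 7 in practice: a minimal sketch of the base64 round trip in Python (the ap2ap transport itself is omitted):

```python
import base64

blob = bytes(range(8))                         # arbitrary binary payload
wire = base64.b64encode(blob).decode("ascii")  # text-safe for an ap2ap message
assert base64.b64decode(wire) == blob          # round-trips intact
```

Note that base64 expands the data by about a third, which matters against the per-message size limits in point 10.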

Monday, October 30, 2006

Bloglines, OPML, Blogger and Flock

I aggregate 50 or so blog feeds using Bloglines; it's a very useful way to keep track of infrequent blogs in particular, and it strips off the adverts and other decoration from the power bloggers.

I just did some tidying up of my blog list, exported it in OPML format and uploaded it to http://share.opml.org/

This is an interesting way to contribute to the "Top 100" blogs list on that site, and it also has some useful features like seeing who else reads the same blogs, and who has the most similar list of blogs.

I seem to have added to the "long tail" since many of the blogs I read were new to the site, so I'm the only reader. My blog had one other reader "Ian" (hello!) who has a very long list of blogs, that I may poke around in if I get some free time...

Meanwhile, Flock is working well as my cross-platform browser. I'm writing blog entries with it, although I did lose an entry I was writing a week or so ago, when I upgraded Flock and didn't save an almost complete entry first. The new version of Flock appears to automatically save entries every few minutes, but I hate re-writing things, so that entry may not be re-created for a while.

Flock seems to have some issues writing entries to blogger.com at the moment. I'm not using the updated blogger.com, but Flock fails to write the blog entry most of the time, then randomly works. I gave up and used cut and paste to post this entry directly....

Pandora Prog Channel

I've been trying Pandora on and off for Internet music for a while, and attended a talk by CEO Tim Westergren last week, which got me to try it again. They are continuously improving their algorithms for choosing music, and I was trying to make a channel that would serve me interesting new music alongside some of my favourite experimental "Prog Rock" bands. It seems to be working much better, and I keep tuning the channel by skipping tracks that I don't like and giving thumbs up to the ones I do. The nice thing is that you can listen to my channel; even though you don't get exactly the same songs as I do, there should be an interesting mix of King Crimson, Zappa, Estradasphere, and many other bands playing music you won't hear often. It's easy to make your own channel (it takes less training to make a more mainstream channel), and it's the best way I've found to discover completely new music.

http://www.pandora.com/?sc=sh59488715848528731

Enjoy...

Blogged with Flock

Sunday, October 08, 2006

CMG06 Conference - Reno December 3-8

As usual I'll be attending the Computer Measurement Group conference in Reno, Nevada this December. I've attended every year since 1994, and it's the place where I get an update on the state of the art in Performance Management, and get to mingle with my friends and peers who work on Capacity Planning.

This year I'm presenting three times:

  1. Sunday morning: a 3-hour seminar on Capacity Planning with Free and Bundled Tools. This is a repeat of last year's talk, presented jointly with Mario Jauvin, who covers the Windows OS and networking-related areas. I cover Solaris, Linux and the system-oriented tools.
  2. Wednesday morning: a conference paper titled "Utilization is Virtually Useless as a Metric!". Regular readers of this blog will recognize much of the content of this paper, which gathers together all the ways in which your measurements can be corrupted by virtualization.
  3. Thursday morning: a 3-hour training course called the Unix/Linux CMG Quick Start Course, which is part of a new feature for CMG and is based on the training classes in performance tuning that I have given for many years.
Early-bird discounted registration is open until October 13th. The Sunday seminars are an extra-cost item, but the Thursday morning training classes are included in the regular conference fee. This is the only place I'm planning to give public training classes, and since I'm at the conference all week it's a great opportunity to discuss performance and capacity issues in person. I hope to see you there...


Blogged with Flock

Monday, September 04, 2006

Updated Ad Setup

I just changed to the flash based ad sidebar. It lets you pick different keywords without reloading the page by clicking on the top tab.

I also continued to focus on Technology Books, picking some specific categories to try to exclude the certification and training books that I find less interesting. I excluded Microsoft and MCSE as keywords, then ended up with Cisco certification books, so I excluded those as well; it looks like a more reasonable selection now.

Blogged with Flock

Sunday, September 03, 2006

Comments on Web Traffic

This blog gets about 50 visitors a day, and most new visitors arrive as the result of a Google search for my name, Thumper/ZFS or the SE toolkit. There is a very small number of Yahoo and MSN searches. The other main source of traffic is the Sun Community blogging site, which links to Sun alumni's blogs, including this one. A few weeks ago I got a lot of traffic from Sun, and I traced it back to a blog entry from Jonathan Schwartz, who talked about his General Counsel's blog; Mike Dillon, the GC, mentioned the Sun Community blog site. This doubled my traffic for a week or two.

The other recent change is that I stopped showing adverts from ctxbay, which was created by an eBay developer as a side project, won a prize, but never really worked well enough to be useful. I've replaced them with the official eBay in-house AdContext system, which is being beta tested. AdContext looks at your page content and matches keywords with popular items from eBay. It can be configured to exclude certain keywords (in my case I exclude "Adrian", which was causing problems for ctxbay), and you can choose certain categories or stores to pick items from. I've picked Technology Books and some storage hardware categories. I'm going to experiment with different formats and constraints to see how well it works.

There are three formats: text only, pictures (which I started with) and flash (which scrolls multiple ads into the same amount of screen space). I'll switch to flash when I get around to it...



Blogged with Flock

Friday, August 18, 2006

Web vs. Skype, a paradigm shift

The essential characteristics of the HTTP-based web are that by default everyone is anonymous, and everyone can get to everything. It's "free", and the trend is for the parts that are not free and anonymous to move in that direction. For example, you can now buy stuff on eBay Express without having to sign up for an eBay account, and there are fewer newspaper sites requiring paid subscriptions, since they are losing audience to the free sites.

However, the essential characteristics of the Skype peer-to-peer network are the opposite of the Internet. Everyone has a clear identity, and no-one can get to anything without asking for permission or being invited. I think this is truly a different paradigm for building systems.

Everyone on Skype is plugged into the public key encryption infrastructure (PKI) which provides a secure identity as well as secure communications between peers. However, to communicate with other peers you need to know them and have permission. For me the most interesting capability on Skype is the application to application messaging API (ap2ap) that enables a new class of distributed applications that leverage the social network formed by the mesh of Skype contact lists.

The upshot of this is that some things that are easy on the Internet are difficult on Skype, and vice versa. There is a temptation to take something that we know works on the web and try to make something similar on Skype ap2ap, but that is pointless: just use the web! Look for things that really don't work well on the web, or look for web-based systems that connect a few people but need an expensive back-end or don't scale. This is the start of something interesting....


Blogged with Flock

Sunday, August 06, 2006

Solaris Internals and Performance 2nd Edition

Richard and Jim have finally finished their updated book and got it published. Rush out and buy a copy! I just listened to a podcast where they talked about it and mentioned that you can get it for 30% off from http://www.sun.com/books. Strangely, my own Sun Performance and Tuning book isn't listed there, although my BluePrint books on Capacity Planning and Resource Management are.

I was also amused to see that in the slide deck they use to launch the book they reference Adrian's Rule of book writing (book size grows faster than you can write).

Congratulations!

Now do I have time to start another book? I'm not sure... maybe.


Blogged with Flock

Sun ZFS and Thumper (x4500)

I was one of the beta testers for Sun's new x4500 high density storage server, and it turned out pretty well. I was able to hire Dave Fisk as a consultant to help me do the detailed evaluation using his in-depth tools, and it turned into a fascinating investigation of the detailed behavior of the ZFS file system.

ZFS is simple to use, has lots of extremely useful features, and the price is right (bundled with Solaris 10 6/06 or OpenSolaris). However, it's doing lots of clever things under the hood, and it behaves like nothing else. It's far more complicated to predict its performance than any other file system we've looked at. It even baffled Dave at first; he had to change his tools to support ZFS, but he's got it pretty well figured out now.

For a start, it has a write-anywhere file layout (WAFL), which is similar in some ways to a NetApp filer. This means that random writes are batched up, sorted by file, file system etc., and every few seconds a big burst of sequential writes commits the data to disk as a transaction. Since sequential writes to disk are always much more efficient than random writes, this means it gets much more performance per disk than UFS/VxFS etc. for random writes.
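A toy illustration of the batching idea (this is not ZFS code; the tuple layout is invented for the sketch): queued random writes are committed in one sorted pass, so the disk sweeps sequentially instead of seeking.

```python
def flush_transaction(pending):
    """Sort queued (volume, offset, data) writes so that a single
    commit writes them in on-disk order rather than arrival order."""
    return sorted(pending, key=lambda w: (w[0], w[1]))

# writes arrive in random order...
pending = [("home", 512, b"a"), ("var", 0, b"b"), ("home", 0, b"c")]
# ...and are flushed as one ordered sequential burst
batch = flush_transaction(pending)
```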

The combination of the x4500 and ZFS works well, since ZFS knows that the firmware on the 48 SATA drives in the x4500 has a write cache that can safely be enabled and flushed on demand. This greatly improves performance and fixes an issue that I have been complaining about for years: finally, a safe way to use the write caches that exist in every modern drive.

It's actually easier to list the things that ZFS on the x4500 doesn't have.

  • No extra cost - it's bundled in a free OS
  • No volume manager - it's built in
  • No space management - file systems use a common pool
  • No long wait for newfs to finish - we created a 3TB file system in a second
  • No fsck - transactional commits mean it's consistent on disk
  • No rsync - snapshots can be differenced and replicated remotely
  • No silent data corruption - all data is checksummed as it is read
  • No bad archives - all the data in the file system is scrubbed regularly
  • No penalty for software RAID - RAID-Z has a clever optimization
  • No downtime - mirroring, RAID-Z and hot spares
  • No immediate maintenance - double parity disks if you need them
  • No hardware failures in our testing - we didn't get to try out some of these features!

and finally, on the downside

  • No way to know how much performance headroom you have
  • No way to get at the disks without taking the top off the x4500
  • No clustering support - I guess they couldn't put everything on the wish list...

The performance is actually very good, and in normal use it's going to be fine, but when we tried to drive ZFS to its limit, we found that the results were less consistent and predictable than with more conventional file systems. Some of the issues we ran into are present in the Solaris 10 6/06 release, but when the x4500 ships it will have an update to ZFS that includes performance fixes to speed things up in general and reduce the impact of the worst-case issues, so it should be more consistent.

We've put ZFS on some of our internal file servers to see how it goes in light usage. However, it always takes a while to build up confidence in a large body of new code, especially if it's storage-related. If we can add this one to the list:

  • No nasty bugs or surprises?

Then ZFS looks like a good way to take a lot of cost out of the storage tier.

I'm interested to hear how other people are getting on with ZFS, especially mission critical production uses.


Blogged with Flock

Monday, July 24, 2006

IEEE Conference Paper

I attended the IEEE E-Commerce conference in San Francisco. The conference is known as CEC06/EEE06 and some other acronyms. It was a very interesting academically oriented event with a few hundred people from all over the world; I made some good contacts and learned some new stuff.

My own paper was about how I built a large-scale simulation of a peer-to-peer network using a very efficient architecture based on the Occam language. I used to write a lot of Occam about 20 years ago, and it seemed appropriate to the problem I wanted to solve. I think most people are baffled by the language, but I like it. Unlike most recent languages, where everything is an object with types and methods, in Occam everything is a process with protocols and messages. The other difference is that Occam was designed to run fast on a 10MHz CPU, so on today's CPUs it is extremely fast and small compared to recent languages like Java.

What I found at the conference was that most of the simulation frameworks people were using were run overnight to generate results. My own example simulation of 1000 nodes ran for about three seconds to produce an interesting result.

The full paper can be obtained from http://doi.ieeecomputersociety.org/10.1109/CEC-EEE.2006.81

This is the official URL, and IEEE charges non-members for downloads.



Blogged with Flock

Tuesday, June 20, 2006

CPU Power Management

AMD PowerNow! for the Opteron series of server CPUs dynamically manages the CPU clock speed based on utilization. The speed takes a few milliseconds to change, and it is not clear exactly what speeds are supported, but one report stated that the normal speed of 2.6GHz would drop to as low as 1.2GHz under a light load. The same report also shows detailed CPU configuration and power savings. http://www.gamepc.com/labs/view_content.asp?id=opteron285&page=3

The problem with this for capacity management is that there is no indication of the average clock rate in the standard system metrics collected by capacity planning tools. PowerNow! is described by AMD at http://www.amd.com/us-en/0,,3715_12353,00.html and drivers for Linux and Windows are available from http://www.amd.com/us-en/Processors/TechnicalResources/0,,30_182_871_9033,00.html. In the future, operating systems may be able to take the current speed into account and estimate the capability utilization, but the service time is higher at low clock rates, so we will still see some confusing metrics.
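As a hedged sketch of the correction a capacity tool would need (using the 2.6GHz and 1.2GHz figures quoted above; this is a first-order estimate that ignores memory-bound work, which does not speed up with the clock):

```python
def capability_utilization(measured_util, current_ghz, max_ghz=2.6):
    """Scale utilization measured at a reduced clock rate to the
    fraction of the CPU's full-speed capability actually consumed."""
    return measured_util * current_ghz / max_ghz

# 80% busy at 1.2GHz is only ~37% of full 2.6GHz capability
est = capability_utilization(0.80, 1.2)
```

So a tool that reports 80% utilization may be looking at a CPU with well over half its real capability still in hand.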

The current PowerNow! implementation works on a per-chip basis, and Opterons have two complete CPU cores per chip that share a common clock rate. In a multiprocessor system made up of several chips, each pair of processing cores could be running at a different speed, and their speeds can change several times a second.

Our basic assumption for well behaved workloads, that mean service time is a constant quantity, is invalidated in a very non-linear manner, and utilization measurements will also move in mysterious ways....

Monday, June 12, 2006

eBay AdContext Contextual Adverts

I've been running contextual ads on this site for a few months, using the CTXbay service that won an eBay developers program award. I don't think I've had many people click through, since the ads aren't very relevant and seem fixated on the word "Adrian".

Now that eBay has announced its own AdContext service is on the way, I'm planning to replace CTXbay with the official service as soon as I can get access to it. The eBay service has access to a lot more information and is quite customizable. I asked about it at the Developers Conference and was told that I could set a default category and control what kind of items appear.

eBay Wireless WAP Access | by Adrian Cockcroft | June 12th, 2006

I was at the eBay developers conference showing some proof of concept prototypes of new mobile applications, and I found that hardly anyone knew that eBay already has a WAP based mobile version of the site. It loads in seconds on any phone that has any kind of web browser, but there is no automatic redirect from the main eBay site. You should bookmark this on your phone's web browser:

http://wap.ebay.com

You can also use this site to backend other mobile applications. If your own code helps a user find an item on eBay then you can form a URL that contains the item id and go directly into the official eBay WAP based site. It handles user login, MyEbay, watchlists etc. The main problem with the WAP site is that the search functionality is too simplistic. The prototypes we were showing (no, you can't access them, you should have been there...) are aimed at fixing the finding experience.
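As a sketch of what that deep-linking might look like: the path segment and parameter name below are invented for illustration, since the post doesn't give the actual URL format; check the real wap.ebay.com site for the correct form.

```python
# Hypothetical sketch: deep-linking a mobile app into the eBay WAP site
# by item id. "viewitem" and "id" are assumed names for illustration,
# not the documented interface.
from urllib.parse import urlencode

def wap_item_url(item_id):
    # Build a query-string URL pointing at the (assumed) item view page
    return "http://wap.ebay.com/viewitem?" + urlencode({"id": item_id})

print(wap_item_url(1234567890))
```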

Tuesday, June 06, 2006

Part 3: Disruptive Innovation viewed as a Maturity Model | by Adrian Cockcroft | June 6th, 2006

This time I'll take a more abstract view of a maturing market as each phase evolves, and refer to the development of in-home movie watching as an example.

  1. An emerging market is characterized by competition on the basis of technology. Early adopters like to play with new technology and are able to cope with its issues. Many different products are competing for market share on the basis of "my features are better". Think of the early days of the VCR, with VHS vs. Betamax. In a mature market, few people worry about features, most VCR or DVD players have the same feature set and very good picture quality at a very low price. If you want to be sure you get a good one, you are most likely to buy using brand name (e.g. Sony) rather than poring over detailed specifications. Margins are low, but volume is high and margins can be better if you won the brand battle.
  2. The next phase in the market is characterized by competition on the basis of service. Think of the video rental store as a service. You visit the store and pay rental according to how much you use the service. As an emerging service, anyone could set up to rent videos and DVDs. As the market matured, larger stores with a bigger selection and more centralized buying power provided a better service, and video rental chains such as Blockbuster took over the market. Again, the power of a dominant brand became the primary differentiator as the service market matured.
  3. The third phase in the market is the evolution of a service into a utility. A utility provides a more centralized set of resources, and a regular subscription or monthly bill. It can provide similar services, but in a more automated manner. NetFlix is my example of a utility-based DVD provider service. You pay a monthly fee which encourages steady consumption, and NetFlix have automated the recommendation system, which replaces asking the counter clerk in a video rental store for advice. The recommendations are the result of many people's opinions, so are likely to be less biased and better informed, but the most important difference in the utility approach is that it doesn't need people to provide the service directly to the customer. This makes it fundamentally cheaper. Many traditional services were transformed into utilities by the arrival of the Internet, which allows consumers to access information-based utilities in a generic and efficient manner. The network effect benefit of having a large user base also causes dominant brand names to emerge. NetFlix leads mindshare in this space; despite attempts by BlockBuster to copy their business model, NetFlix can grow faster with fewer people as a pure utility.
  4. The final phase in the evolution of a market occurs as the cost of replication and distribution of the product approaches zero. For digital content the end customer already has a computer and an Internet connection. There is no additional cost to use it to download a movie. A central utility such as YouTube can use a mixture of advertising and premium services (for a minority of power users) to offset their own costs. Peer to peer systems distribute the load so that there is no central site and no incremental cost in the system. The only service that is needed is some kind of search, so that peers can find each other's content to exchange it. PirateBay is primarily a search engine, and search engines become dominant when the brand gets well known, and they find what you are looking for because they have a comprehensive index.
So the evolution of a marketplace goes from competing on the basis of technology, to competing on service, to competing as a utility, to competing for free. In each step of the evolution, competitors shake out over time and a dominant brand emerges.

To use this as a maturity model, take a market and figure out whether the primary competition is on the basis of technology, service, utility or search, and consider whether a dominant brand has emerged in that phase. The model should then indicate what the next step is likely to be, so you can try to find the right disruptive innovation to get you there. Good luck!

Saturday, June 03, 2006

Part 2: Moving Pictures - disruptive innovation from the Cinema to PirateBay | by Adrian Cockcroft | June 3rd, 2006

Let's look at the history of movies. The initial technology to capture and replay moving pictures was developed around 100 years ago, and the initial competition between inventors went through its first transition when movie theaters became established and began to settle on a standard form of projector. The inventors who had alternative camera/recording/projector technology died out. Consumers wanted to go see movies and the movie industry formed to provide content for that market.

The next innovation was to be able to watch movies at home on film, then there were movies on TV. The movie theaters had far bigger screens, better sound and color, but the technology at home gradually caught up in features and reduced in cost, and a market transition to home viewing occurred. The total market size for equipment bought to watch movies at home is huge. It's important to note that the primary vendors in each phase of the market are different. The movie theater business is very different to the home video equipment supplier business. The early battles in the home were over the standard formats, famously Betamax failed to win over VHS for video tape, and there are continuing battles over DVD formats, but Sony is a dominant brand name in a crowded market for home video equipment.

The next innovation was video rental, and Blockbuster ended up as a major player in this market, with presence on every high street. However, that presence became unnecessary as Netflix shipped DVDs directly to consumers and took over a large share of the market.

Finally, video is available directly over the Internet, it's being viewed on PCs rather than TV sets, anyone can create and upload it, and YouTube is this year's hot market-leading name in this space for all kinds of short videos. It's also trivially easy to take a full length movie or TV program and share it using one of the many BitTorrent services, and a growing proportion of movies are being watched for free, to the consternation of the movie industry.

The PirateBay site in Sweden was recently shut down and its operators charged with copyright violation, but it appears that a significant proportion of the population of Sweden were users, and they got upset as they had got used to exchanging content for free. After three days the site came back up, hosted in Holland, and with even more users due to the publicity.

Unlike YouTube, BitTorrent sites such as PirateBay don't host the actual content; they just connect individual users who exchange it. They don't need to provide storage or bandwidth, just a searchable database of small index files that configure the BitTorrent transfer between a large number of seeders, who already have some or all of the file, and leechers, who want to get the file and can in turn become seeders.
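The seeder/leecher mechanism can be sketched as a toy model. This is an illustration of the idea only, not the real BitTorrent protocol; the class and piece layout are invented for the example:

```python
# Toy sketch of seeders and leechers: the index site stores only which
# peers claim which pieces; peers fetch missing pieces from each other,
# and a leecher that completes the file becomes a seeder in turn.

PIECES = {0, 1, 2, 3}  # a file split into four pieces

class Peer:
    def __init__(self, name, have):
        self.name, self.have = name, set(have)

    @property
    def is_seeder(self):
        return self.have == PIECES  # seeders hold the complete file

def exchange(leecher, peers):
    """Pull each missing piece from any peer that has it."""
    for piece in PIECES - leecher.have:
        if any(piece in p.have for p in peers):
            leecher.have.add(piece)

seeder = Peer("seeder", PIECES)
partial = Peer("partial", {0, 1})   # a leecher that already has some pieces
leecher = Peer("leecher", set())

exchange(leecher, [seeder, partial])
print(leecher.is_seeder)   # True: the leecher can now seed for others
```

The point of the model is that the central site never touches the pieces themselves, which is why a search index is the only service the operator has to run.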

The publicity gained as a side effect of trying to shut down the PirateBay site may even have the opposite effect of cementing the PirateBay brand as a market leader and accelerating growth in this space.

Every step in this history involves a disruptive innovation. There is a fundamental reduction in cost, offset by a large increase in unit volume, which has often increased the overall revenue using a new way to monetize the market for moving pictures. Each time the previous market leader is left behind (often kicking and screaming) as the new larger market emerges. Each time a new brand captures the attention span and trust of the consumer, and dominates the market.

Part 1: Disruptive Innovation in the path from technology to brand - a maturity model | by Adrian Cockcroft | June 3rd, 2006

Products aim to fill a need in a market; products that are disruptive innovations also reshape the market, and markets tend to evolve in a series of discontinuous steps as they mature. The phrase "crossing the chasm" has been used to describe these changes, and "early adopters" are the people who first move a market to a new phase.

In the next few posts I'm going to describe a generic maturity model that applies to many markets, and show how disruptive innovations may drive a market into a more mature phase. I got the initial idea of looking at markets in this way from Dave Nocera of Innovativ in a presentation he gave at SUPerG in early 2004. He used video as an example, with the move from VCR to Video rental to Online. I have extended that example, and come up with a generic maturity model based on it, which I also apply to the Telco industry.

Monday, May 15, 2006

See you at the developers conference? | by Adrian Cockcroft | May 15th, 2006

The combined eBay, PayPal and Skype developer conference is coming up, June 10-12 in Las Vegas. I missed the event last year, but I will be staffing it this year! A few of us are being let out of the mysterious eBay Research Labs for the occasion. They told us to get to work on the future of e-commerce, and have kept us locked up for months, shipping in occasional supplies of Starbucks and fresh interns. I've been writing serious amounts of code for the first time in years, in fact I'm too busy writing Java to have time to go to JavaOne this week.

The conference is supposed to illuminate questions such as:

- What will the next technology revolution be

- How will it impact commerce and communications on the web

- And what opportunities will it provide for developers and technology innovators

- How will the Long Tail Theory play out

- Web 2.0 and how to build revenue streams

It's become a common joke to keep incrementing this: Web 2.1, Web 3.0 etc. but personally I think the most interesting developments aren't even Web-based. For example, Skype isn't a Web application, it defines its own virtual private peer to peer fabric that overlays the Internet.

See you in Vegas!

Wednesday, April 19, 2006

Blogging Tools | by Adrian Cockcroft | April 20th, 2006

I've been using blogger for the last 18 months; it was an easy way to get started, but now it's missing some of the features I want. The basic service has changed very little in that time, so it doesn't seem to be getting much investment and development.

The three missing features I see in other blogs are tags, a blogroll, and posting categories.

I want to have an easy way to add a series of tags to each blog entry, without having to create custom html. I did it once the hard way and don't usually bother.

I'd like a blogroll so that people can see which blogs I think are worth reading, but I don't want to edit my html template to get one, I want to import OPML or have a table to edit.

I'd like to be able to separate categories so that I can label rants like this separately from technical info on capacity planning, thoughts on the industry, personal stuff.

I like the web based blogger service, I can post from anywhere using any device (I've posted to blogger from Linux, Solaris, Windows, Mac and Treo/palmOS). I don't want to host my own blog or have to install a blogging tool.

I use bloglines as an aggregator to read blogs, I could also use bloglines to host my own blog, since it does seem to have some of these features, and it would make referring easier.

What other options are out there? Is there a slightly better blogger competitor that I should check out? Is there a way to migrate existing entries to a new blog? Comments requested...

Cheers Adrian

Comparing Smart Mobile Phones | by Adrian Cockcroft | April 19th, 2006

It's been a while since I last posted, mostly due to a long vacation. We stayed with friends in New York for a few days, spent 10 days on Bermuda (very nice and relaxing) and spent a few days in New York again on the way home.

In the last week or so I have been trying out a new phone. I got a Nokia 6682 which runs the Symbian S60 operating system and which has fairly good third party support for applications that use Java, Flash and Opera. My history with phones started out with Nokia for many years, and then I switched to the Treo line. I've had a Treo 270, 600 and currently have a 650. Going back to Nokia was in some ways familiar, the user interface has some similarities to the older days, but overall I miss my Treo and I'm going to switch back. I thought it might be interesting to discuss the differences and what I think a state of the art smartphone should be able to do for me.

The Nokia 6682 has a decent spec: large color screen, 1.3Mpixel camera and a 64MB removable flash card included. The spec page even says that you can "Bid on eBay on the go", but that is available to any phone that can browse to wap.ebay.com. With a normal Cingular GSM service it's not using a 3G high-speed network, so data network access is similar in speed to the Treo 650.

My main problem is that I've been spoilt by the Treo's touch screen and keyboard. When using the Nokia, at first I was poking at the screen in vain trying to select things. The screen is fairly high resolution, but it's an eye test in that the text size is too small for many features, and the colors available in the default set of themes have poor contrast. The Nokia is actually much harder to read. For text entry I'm used to using Nokia's predictive text feature, but it is still extremely painful to enter a text message or URL.

The Treo's browser (Blazer) is easy to use but is not well supported in terms of javascript, and many sites don't recognize it properly. On the Nokia there is a built-in browser (called Web) and Opera 7 is included, with a free upgrade to Opera 8.5. I found the "Web" browser OK to use, but Opera was very annoying and unintuitive. With the Treo I can quickly get on the web to look something up; it just takes too long on the Nokia, and with Opera I found the navigation commands to be confusing and awkward. I tried to make use of the javascript support in Opera, but it didn't work for Google maps, despite Google claiming that Opera 8 is supported.

I also had problems with the Nokia after browsing the web. The phone keeps applications running in the background and tends to run out of memory at awkward moments. I tried to use the camera, but at the point of taking a picture it failed with a lack of memory. I had to bring up the web browser and explicitly exit it, and meanwhile the photo opportunity had gone. The Nokia's camera is higher resolution and has continuous zoom that works for video as well as pictures. For the Treo, you have just 1x and 2x zoom settings and a 640x480 resolution. It's not really enough to snap pictures of whiteboard scribbles clearly. The Nokia has a sliding cover for the phone, which activates the camera when opened, but it's too easy to open by accident when getting the phone out.

Other phones I've seen recently include the Verizon LG VX9800, which is a very fat clamshell with a nice big keyboard, 3G networking and a real eye-test of a small hi-res screen. A friend got one but is taking it back; it's more suited to gaming and entertainment than business. My son has a Motorola SLVR L7 with iTunes and seems happy with it. It looks cool and fits his interests, but he doesn't try to use the web from his phone.

Some co-workers have Windows mobile phones. I haven't tried to use them myself, but I've heard a mixture of good and bad comments; opinion seems very polarized, as love it or won't touch it.

So in summary, the things I can't do without on a phone are a touch screen, full keyboard and a large (not just high resolution) display with fonts and icons that can be read easily. The things I don't like about the Treo 650 are its lack of support for Opera 8 (which may be more usable with a keyboard and touch screen), Javascript and Flash.

My favourite applications on the Treo are the Chatter email client, Planetarium for identifying stars and planets, and Solitaire for mindless time wasting - which would be a pain to play without the touch screen. I also find that the mobile version of bloglines works well with the Treo's browser, so I can keep up with my feeds.

Tuesday, March 14, 2006

How to finish writing a book | by Adrian Cockcroft | 15th March 2006

I've written four books, and several years ago I developed "Cockcroft's law of book writing". This states that a book will grow in size as you write it, and that the number of pages left to write will increase as you write. This seems counter-intuitive, but it has been confirmed many times in practice. I hope this posting provides some useful advice for writers, and helps people finish what they have started.

To make a concrete example, let's say you decide to write a book and you come up with an outline that adds up to 200 pages. You start work and write 50 pages, then, when you revisit your outline to update the page count estimates, you find that they now add up to 300 pages. You wrote more than you expected to cover each subject, and discovered more subjects that needed to be discussed. The essential problem here is that there are now 300-50 = 250 pages left to go. Before you started you only had 200 pages left to go.

This problem is recursive: if you write another 50 pages you will find that you have now written the first 100 pages of a 400 page book, and you now have 300 pages left to write. This explains why there are so many people who have written part of a book, but never finished it.

The approach I took in writing my later books was to maintain a spreadsheet that tracks the pages left to write or edit, update it very regularly, and generate a plot with a trend line from the data. You can then see when (or if) you will finish the book. In order to get the trend line to target a specific delivery date, you have to force the number of pages left to go down. You do this by writing pages that you promise never to edit again, and by deleting whole sections and chapters. I deleted three entire chapters from one of my books to get it finished.
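A minimal version of that spreadsheet trend line can be sketched in code, using made-up numbers: fit a straight line to the pages-left series and see where it crosses zero. A positive slope means the book is growing faster than you are writing it.

```python
# Sketch of the "pages left to go" tracker (data is invented for the
# example): fit a least-squares line to the pages-left series and
# project the week in which it reaches zero.

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope*x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

weeks      = [0, 1, 2, 3, 4]
pages_left = [200, 190, 175, 155, 130]   # forced downwards by cutting sections

slope, intercept = fit_line(weeks, pages_left)
if slope < 0:
    print("projected finish: week %.1f" % (-intercept / slope))
else:
    print("the book is growing; start deleting chapters")
```

With this toy data the trend crosses zero around week 11.7; on real data the interesting signal is whether the slope is negative at all.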

Another problem you can run into is that the content you wrote at the start of the process is less well written than later content, so you think you have finished, re-read parts of the book that were finished ages ago, and discover that it needs a complete rewrite.

I often get asked if I will update my Sun Performance and Tuning book, and I don't intend to do a third edition. This is mainly because I'm interested in other things, and I'm no longer up to date with the subjects I would need to cover. I have sketched out a possible book on capacity planning with free tools, and the trend line on that book is nice and flat. I haven't really started writing it, and so it hasn't started getting bigger yet....

My good friends Jim Mauro and Richard McDougall are closing in on the end point for Solaris Internals 2nd Edition. I've been looking forward to it for a while; it's going to be a monster book, covering how Solaris 10 really works with lots of DTrace-based examples, and is going to be the essential companion for anyone looking at Open Solaris.

Saturday, March 11, 2006

Strange Contextual Ads | by Adrian Cockcroft | 11th March 2006

The eBay contextual ads seem to be fixated on the word "margin", which is not present in the content of my blog. However, the CSS template that is used to host this blog contains the word margin over and over again. I think Alex needs to do a better job of filtering out formatting words before he does his context analysis....

Thursday, March 09, 2006

Contextual eBay adverts with ctxbay | by Adrian Cockcroft | 9th March 2006

The winners in the eBay developer contest were announced at ETech, one was Alex Stankovic, who has developed a contextual advertising system for eBay that works just like Google Adsense. I just changed my advert bar for this blog to use his system at ctxbay.

At the ctxbay site you find a link to sign up with eBay for the affiliate program (via Commission Junction); this was an easy, fast signup, and gets you an affiliate id number. You then create an account at ctxbay and enter your affiliate number, which they will then use to call back to Commission Junction and make sure you get paid.

The rest of the setup is similar to adsense, however you do need to login to ctxbay with your new account, and it didn't do this automatically for me. ctxbay generates a selection of common ad frame formats using javascript that can be slotted into your site template. I found one identical to my adsense format, and swapped out the code. My first attempt didn't work because I was not logged in, and the id field in the javascript was empty. After I logged in I got a fairly long string that keys my ad to ctxbay.

I set up adsense in order to understand it better, and going forward I'll see if ctxbay can generate any sensible eBay items out of the keywords in my blog.

Wednesday, March 08, 2006

Etech Tuesday On Rails | by Adrian Cockcroft | 8th March 2006

A long day with lots of interesting talks, and I got to chat with several new people and also to try out my FLORWAX pitch. "It's the equivalent of AJAX but for Wireless" is my instant summary. To get this out of the way: the general reaction is that the name gets a chuckle (not too many groans yet), that there really is a big problem in wireless platform fragmentation, and that no-one seems to know of any other initiatives that have picked on this as a problem to solve. I think most people look at wireless, see this problem, and give up, as it's too hard to make something work. Since I'm interested in a longer term perspective than most people, it seems fair game to try and provoke a discussion on what a sensible core set of wireless platform technologies would look like.

As Jesse James Garrett said in the tutorial yesterday, the key elements of AJAX are that it uses a common standard bundle of browser-based technologies and that it is asynchronous, so you don't have to click-and-wait......click-and-wait......
If we apply these principles to Wireless, we need to define a bundle of standard technologies (I suggest Flash Lite 2.0, Ruby on Rails, XML web services - FLORWAX = FlashLiteOnRailsWirelessAsynchronousXml, but the actual bundle doesn't matter as long as a common set emerges). However the asynchronous problem is far worse in wireless than in desktop applications; we really need to have wireless apps that talk to the backend and update the screen without the click-and-wait-for-ages mode that is the norm.

At the end of the day I attended the Ruby on Rails BoF. I have heard good things about RoR but haven't used it. I think they converted me, and I took the opportunity to mention FLORWAX to the group. It does seem like the right technology fit.

The conference itself started with Ray Ozzie showing how to do cut and paste on the web. It seems so trivial, why hadn't it been done before? A very useful way to make web apps behave more like regular apps. We then had a very cool hardware demo by Jeff Han; he has a touch screen that can see all his fingers separately and has created a very nice new set of user interaction paradigms.

Amazon has created a way to harness real people to do the stuff that AI can't do. It's called the Mechanical Turk, and it's another simple idea with quite profound and wide reaching uses. Dick Hardt from Sxip gave an interesting talk on identity, but the way he presented it, with one word per slide and rapid fire transitions, reminded me of Stephen Colbert presenting his "The Word" section on The Colbert Report. I enjoyed it but I don't remember much of the content.

Next we had a talk from Felix Miller of last.fm on how they collect the metadata on what you are listening to and use it to help you find new music. I've been playing around with Pandora and training it to play the music I like, and I think I'll have to have a go at last.fm as well. I have eclectic tastes, and it's hard to keep the recommendations from veering back to the mainstream in Pandora.

After the break, there were several presentations that didn't grab my attention or told me things that seemed obvious to me. The highlight was a presentation on Second Life that was presented using a billboard in the virtual world and lots of interactive explanations of how it all works. Fascinating, but I don't have enough time to play as much as I'd like in the real world....


Tagging Blogger and Technorati | Adrian Cockcroft | 8th March 2006

Blogger doesn't provide an integrated way to tag my postings (as far as I can tell) but at ETech we were advised to tag our blog entries with Etech and Etech06 so I tried to figure this out during one of the less interesting talks (yes I know I should have figured this out ages ago). Despite a slow and intermittent wireless internet connection I found that I could use Technorati to do this by embedding some html in my blog. The frustrating problem I ran into was that Technorati seemed to be completely overloaded and was largely unresponsive. When I did finally get my blog entry tagged I went to Technorati and searched for Etech, and after a long wait it came up with no results at all... I figured I had done something wrong, but late at night the load dropped off and the site now does actually find Etech tags, including my own one. I celebrated by adding "florwax" as a tag, so we will see how that goes...

Another site that seems a victim of its success is Myspace; their music delivery service has become overloaded, so it's hit and miss whether you can get any songs to play at the moment.


Monday, March 06, 2006

Thoughts from ETech - Tutorial Day - FLORWAX? | by Adrian Cockcroft | 7th March 2006

I'm at the O'Reilly Emerging Technology conference in San Diego. Today was "Tutorial Day" and I decided to attend "Designing the next generation of Web Applications" in the morning and "Next Generation Flash Development with Flex" in the afternoon.

The morning talk was very nicely presented. Jesse James Garrett coined the term AJAX amongst other things, and Jeff Veen worked on Hotwire, Blogger and Measuremap; together they provided a structured set of best practices for designing web applications with lots of great examples and anecdotes.

AJAX acts as a convergence point for browser based applications because almost all current browsers support the same set of technologies and there is a highly functional lowest common denominator. There is now also a large body of applications that provide the inertia or value that constrains the browser writers from diverging with incompatible functionality. This is the same effect that occurred in the PC marketplace, when MSDOS and Windows developed enough application value that neither Intel nor Microsoft could diverge in an incompatible manner. The collective self interest of the end user reaches a tipping point that blocks radical innovation, and slow incremental evolution takes over.

The interesting area for me is how this maps to the mobile/wireless space. There is no AJAX for wireless applications, the market is huge, but the platform diversity is also huge and is growing. One estimate I heard was that there are 1000 separate platforms to target and that this number is growing, not shrinking.

So what we need is the equivalent of AJAX for Wireless, something like Flash Lite On Rails Wireless Asynchronous XML - FLORWAX - which also has a household cleaning connotation :-)
If enough people standardize on a common set of technologies, then the handset vendors will start to build to a common profile as well, and we could end up with a decently functional lowest common denominator.

Friday, February 24, 2006

Conferences and Innovation

I just signed up for the O'Reilly Emerging Technology event in San Diego next month - http://conferences.oreillynet.com/etech/

I've also written a paper for a workshop in the IEEE Joint Conference on E-Commerce Technology (CEC'06) and Enterprise Computing, E-Commerce and E-Services (EEE'06) http://linux.ece.uci.edu/cec06/ - but the conference name is so long that I can't remember it very well in conversation. This conference also includes the 2nd International Workshop on Business Service Networks (BSN '06) and the 2nd International Workshop on Service oriented Solutions for Cooperative Organizations (SoS4CO '06). It all sounds very interesting; it's in June in San Francisco, and needs a snappier name...

Last December I attended the Fortune Innovation Forum in New York. It was very nicely put together and in effect it validated the approach we were already taking. It seems that most of the attendees were trying to work towards a culture, process and tools for fostering innovation that seemed similar to our own setup. eBay and PayPal were used as examples several times.

We used a few simple techniques last year to kickstart our own innovation program. One method I borrowed from other events is the "Poster Lunch". Get a room near the company cafe, provide flip chart sized pads and pens, email everyone to tell them about it and put up signs in the Cafe to invite them in on the day. Anyone can put anything they like on a poster, stick it up and collect comments on it in person. One thing we found was that there were several posters suggesting eBay site features that already existed or were in development. One suggestion in particular was getting lots of support and comments until someone wrote on it "LTS thursday!", meaning it would be Live To Site and be launched two days later. We also gave attendees voting stickers so that they could indicate their favourite posters.

To drill down on the best ideas we also setup a regular open-to-all meeting where we could discuss the concepts and route them to the appropriate expert or business owner. The most far-sighted ideas get routed to become candidates for research labs projects, and the people who had the ideas get to develop them further.

To support the collection of ideas, we created a Wiki. This is nice because it is free format, and supports comments and discussion, with a very low initial barrier to entering an idea. The problems came when there were several hundred ideas in the Wiki; it became hard to maintain. A more specialized pre-concept tool that feeds into our standard development process is a better solution for incremental innovations, and the Wiki works better for more radical ideas.

To really get a dose of innovative ideas, last year I attended a seminar on Complex Adaptive Systems by the Santa Fe Institute. It was a real eye-opener; they are pushing the boundaries of multi-disciplinary research, e.g. forming teams with Physicists, Biologists and Economists to derive the rules of scaling and organization of living things, from the smallest mammal to the largest city. Since eBay, PayPal and Skype are social networks (their value comes from connections within their communities), they behave in some ways like cities, and follow similar kinds of scaling rules.

Saturday, February 18, 2006

Changing gears

I started this blog in the summer of 2004 when I had finished at Sun and not yet started at eBay. After 16 years at Sun this was a big move. I knew people at eBay from the time in 1999 when they had a big outage and many Sun people got involved in helping them get up and running again. My thinking that summer was that web services platforms were where the real innovation was taking place, and I see eBay and PayPal as the leading transactional web services platforms.

My first year at eBay was in the Operations Architecture group, where I was working on figuring out new platforms and upgrades, and helping with capacity planning tools and processes. I also figured out a lot about how eBay and PayPal really work, and the challenges of scaling a rapidly growing and changing high availability transactional platform to a size that is beyond most people's comprehension. I had some entertaining meetings with hopeful vendors who would come in with solutions to common industry problems (e.g. low utilization) that eBay doesn't have, and their largest existing deployment would be an order of magnitude too small to be useful. After describing a bit about how eBay works, they would get big eyes, admit that their product wasn't appropriate, and wander off to look for more normal customers... A lot of what eBay does is built internally because the generic products don't scale and we can build what we need ourselves for less.

In the summer of 2005 I moved internally to help form eBay Research Labs. Since then we have hired some very experienced researchers and are becoming the focal point for innovation within eBay. This was another opportunity for me to change gears and greatly increase the scope of my work. Part of my role is to continue to research new platforms and technologies for datacenter operations, and I've been joined in this work by my friend Paul Strong. Paul was in the N1 group at Sun, and is also the drummer for Fractal. Paul and I were both involved in the Enterprise Grid Alliance; he ended up as chair of its Technical Steering Committee and edited the EGA's Grid Reference Architecture. He's now working on how to enhance the automation of eBay's datacenters.

The other cool thing that came my way in 2005 was eBay's purchase of Skype. It's not just a VoIP tool; it's a huge and fast-growing community (something eBay understands very well) and an extremely innovative development platform. The Skype API is a fun place to do innovative research, and the Skype network has between 3 and 5 million active nodes at any point in time (up by a million in three months). I've been interested in the telecom market ever since I was one of the Sun systems engineers working with British Telecom in the early 1990s. Now I get to play with the future of telecom in the form of Skype, and I'm also very interested in mobile/wireless applications.

In another sense I am changing gears with this post. I've changed the title and description, and it is now also being included in the Best of eBay Blogs site. I've been encouraged by the example of other bloggers at that site to discuss a bit more openly what I get up to, but if you ask me what I'm really working on, all I can say is "The future of e-commerce".

Cheers Adrian

p.s. I just tried to spell-check this posting, and the built-in spell checker at blogger.com decided that the first error was the word "blog", which I find highly amusing, so I gave up and any spelling errors in the above are my fault.

Monday, January 30, 2006

Interesting hardware for database servers

I've been too occupied on other things to keep posting regularly in the last month. The good news is that I'm learning a lot about some new areas.

So what is new in hardware? I think there are some interesting trends in server hardware for running databases. The cost base of a mid-sized Solaris/Oracle server with 32-64GB of RAM is dropping fast. A very common platform in this space has been the 8-way UltraSPARC III based V880, moving over the last few years to the 12-way V1280 and currently the E2900 (a 24-core V1280 chassis with UltraSPARC IV). Prices vary by configuration, and newer systems give you more performance per $, but they are of the order of magnitude of $100K (plus the disk subsystem and software licenses - but that's another topic).

The two new entrants in this space are Niagara-based systems and Opteron-based systems; each has its strengths. When loaded up with RAM, the costs are largely dominated by the price of RAM rather than the CPU itself; however, both of these systems use commonly available DIMMs, rather than the more specialized and expensive memory of the older generation systems.
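To see why RAM dominates, a back-of-envelope sketch helps; every price below is a hypothetical placeholder I made up for illustration, not a quote:

```python
# Rough cost split for a RAM-heavy server build.
# All prices are hypothetical placeholders for illustration only.

dimm_price = 500          # per 2GB DIMM, assumed
dimm_count = 16           # 16 x 2GB = 32GB
cpu_price = 2_000         # assumed CPU cost

ram_cost = dimm_price * dimm_count
total = ram_cost + cpu_price
ram_share = ram_cost / total

print(f"RAM: ${ram_cost}, share of CPU+RAM cost: {ram_share:.0%}")
```

With numbers anywhere in this neighbourhood, the memory bill swamps the CPU bill, which is why commodity DIMM pricing matters so much.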

Niagara has everything on one chip: 8 cores and 32 execution threads. The cool thing about this for databases is not the low power consumption touted by Sun (which is dwarfed by the disk subsystem for a database application), but that any inter-thread locking will be blindingly fast, since the signals do not have to go off-chip. Badly behaved applications that are sensitive to high memory latency (the kind that don't scale well on physically bigger systems) will run relatively well. However, the cores themselves are not particularly fast, and are atrociously slow for anything that does floating point, so single-stream performance is not a strength. With 2GB DIMMs you can get 32GB of RAM connected to a single Niagara chip; this should move to 64GB using 4GB DIMMs eventually. The performance of a Niagara seems to be a bit better than a V1280, but the cost is much, much lower. Software support for SPARC Solaris 10 doesn't seem to be an issue at this point. Most things are supported, and the system is compatible with earlier releases of SPARC/Solaris products.
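One crude way to probe how sensitive a workload is to lock hand-off cost is a contended-lock microbenchmark. Here is a minimal Python sketch; the thread and iteration counts are arbitrary choices of mine, and Python's GIL makes this only a rough analogy to database latch contention, not a hardware measurement:

```python
import threading
import time

# Crude contended-lock microbenchmark: several threads repeatedly
# acquire one shared lock and bump a counter. On a chip like
# Niagara, where all the hardware threads share one die, this
# kind of lock hand-off never has to leave the chip.

lock = threading.Lock()
counter = 0
ITERS = 50_000
THREADS = 4

def worker():
    global counter
    for _ in range(ITERS):
        with lock:
            counter += 1

threads = [threading.Thread(target=worker) for _ in range(THREADS)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

print(f"{THREADS * ITERS} locked increments in {elapsed:.2f}s")
```

Comparing the elapsed time for the same loop on different boxes gives a rough feel for relative lock overhead, which is the dimension where Niagara should shine.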

The common Opteron systems are two-socket/four-core, with a maximum of 16GB using 2GB DIMMs. There are some four-socket and eight-socket systems available from several vendors (including Sun), with 32-64GB, moving to 128GB with 4GB DIMMs. The Opteron seems to have performance per GHz in the same ballpark as UltraSPARC systems, so 8 cores at 2.4GHz would be between the performance of a V1280 and an E2900. The 32GB 8-core Opteron systems are in the same order of magnitude for performance and price as a 32GB Niagara, but far faster for single-stream work and floating point, and relatively slower for lock-intensive workloads where the signals have to move between the Opteron chips. On Opteron the software situation is a little different; there are three possible operating systems: Solaris 10, Linux and Windows 64. Solaris support isn't as good as it is on SPARC; for example, Oracle 10g is the only option, and the earlier releases of Oracle don't seem to be available. Linux probably has the widest choice for support, but 64-bit Linux on larger systems doesn't seem to scale as well as Solaris 10 in my experience. Linux tends to be more efficient than Solaris on 32-bit systems (your mileage will vary; it depends greatly on what features of the OS your workload hits hard). I don't know anything about Windows 64, but I expect these large Opteron systems will be good SQL Server platforms.
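The "performance per GHz" comparison above can be sketched as simple arithmetic. The clock speeds and per-GHz efficiency factors below are my own illustrative assumptions (normalized so a core-GHz of UltraSPARC counts as 1.0), not measured numbers:

```python
# Back-of-envelope throughput estimate: cores x clock x relative
# per-GHz efficiency. All factors are illustrative assumptions.

def relative_throughput(cores, ghz, per_ghz=1.0):
    return cores * ghz * per_ghz

v1280 = relative_throughput(cores=12, ghz=0.9)    # 12-way UltraSPARC III, assumed 900MHz
e2900 = relative_throughput(cores=24, ghz=1.2)    # 24-core UltraSPARC IV, assumed 1.2GHz
opteron = relative_throughput(cores=8, ghz=2.4)   # 8-core Opteron at 2.4GHz

# On this crude scale the 8-core 2.4GHz Opteron lands between
# the V1280 and the E2900, matching the estimate in the text.
print(v1280, opteron, e2900)
```

This kind of estimate ignores memory bandwidth, lock behaviour and I/O entirely, so it is only a starting point for sizing, not a substitute for measurement.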

That's what the landscape looks like to me as we go into 2006. I hope to be doing some testing later this year to compare all the options, including Intel's next-generation servers, to make my performance and price comparisons more precise than the general comments above. I'd be interested to swap experiences with other people moving in this direction.