Archive for April 2008

 
 

.NET Framework Sockets Performance

I was doing tests yesterday in order to measure the performance of sockets in the .NET Framework. It kind of looked like J. Jonah Jameson talking to Peter Parker: Crap, crap, megacrap. I’ll give you three hundred bucks for this one.

In the first test I did synchronous IO. The performance was crap.

In the second test I used overlapped IO using Begin/End socket calls. The performance was crap.

Then I found a new addition to the 2.0 framework: SocketAsyncEventArgs and the Socket methods that take it (AcceptAsync, ReceiveAsync, SendAsync and friends). They were the first ones that provided any halfway decent performance. When did this get added? And why do all the MSDN samples use the Begin/End functions?

[Update: Looks like this was added in a service pack. Scott Hanselman has a diff of changes between two releases.]

The Zen of SSH

Unix has traditionally been better than Windows when it comes to doing stuff on a remote machine. On the GUI side, Unix lets you run just a single app on a remote machine and have its window show up completely integrated with your desktop. On the command-line side, you can SSH to any box and it’s as natural as having a terminal window open on your computer.

The Windows story is much sadder. Remote Desktop is severely crippled unless you are running Windows Server in apps mode, plus you only get the full desktop instead of window integration unless you’re using Windows Server 2008.

The command-line side is equally fucked. Windows console apps don’t use terminal emulation and there are no hooks an application could use to find out what another app is doing with the console. Thus secure terminal access to a Windows box is only good if you don’t try to use the Backspace key or, god forbid, Tab for command-line completion. And don’t even think about running an editor on a remote box.

Now the good news: There are two Windows SSH servers that make the whole experience work out OK. They are not perfect but they get the job done.

The free one is FreeSSHd. It has a few rough edges but it’s usable. The commercial one is VShell, which is better but more expensive.

On the SSH client side the story is the same. The free one is PuTTY, but SecureCRT is better.

Set PowerShell as your remote shell and accessing a Windows box over the network becomes pleasant again.

Bonus: Once you install an SSH server you get SCP and SFTP. That means that you can use a program like WinSCP to access another computer’s file system remotely without bothering to set up Shared Folders in Windows. You are logging in with a username and a password, so whatever directories your user can access you can access too.

Bonus #2: If you ever have to edit a file on a Unix box you can use WinSCP to edit the file locally in your editor of choice instead of suffering through vi.

TV Series Recommendation: The Fixer

I rarely recommend TV series as I assume you watched the popular ones: Battlestar Galactica, Lost, the original CSI, and Desperate Housewives and Grey’s Anatomy for the Y chromosome-challenged.

But last week somebody on Releaselog recommended The Fixer and I gave it a try. Definitely a pleasant surprise.

(Side note: The Fixer website uses Silverlight to serve the episodes. This is the first time I’ve seen Silverlight used on a non-MS and non-MS-fanboy property. I may yet get surprised by Microsoft’s ability to battle Flash. Click-Once, for example, was DOA even though it’s a useful piece of technology.)

Emacs for C#

Hot on the heels of Amanjit’s comment to my post about vi and Emacs comes this post by Dino Chiesa about making Emacs play nice with C#.

Government-Sponsored Wholesale Torture

How in the world did the Republicans manage to shift the discussion on what they are doing in Guantanamo to a moral discussion on whether torture is right or wrong depending on the circumstances?

There is no right or wrong when you are fighting for survival. If, in some contrived scenario like the one election debate hosts create, some psycho had my daughter locked up somewhere and I had him taped to a chair in my basement, you can bet your ass I’d do whatever it takes to get the information out of him.

But this is not the real issue. The issue is government-sponsored wholesale torture which is a markedly bad idea. There are no moral tangents when torture is done by the government. The guy doing the torture doesn’t have his ass on the line, he’s doing it for fun & profit. “Oops, we screwed up, looks like you’re innocent. Here are your fingers back.”

Straight from the horse’s mouth

From time to time I’m left scratching my head at some program that I find hideous but people around me sing praises about. I never did ‘get’ the popular Unix editors – Emacs and vi.

I can see how these editors are useful. I can see why they were created. What I could never get is why somebody would throw away a perfectly good 21st century editor (Brief, UltraEdit, Visual Studio 2008, whatever) and go back 30 years to vi or Emacs. Either I ‘just didn’t get it’ or they are wanking. Can’t have it both ways.

The answer? They are wanking. From an interview with Bill Joy, the creator of vi:

“It was really hard to do because you’ve got to remember that I was trying to make it usable over a 300 baud modem. That’s also the reason you have all these funny commands.

“People don’t know that vi was written for a world that doesn’t exist anymore.”

I’m not dissing either Emacs or vi. Both are fine, useful programs. Unix admins like vi because it’s the only editor that can be found on practically any machine. Emacs has a great ecosystem built around it. So if you are using them for practical reasons, cool. But don’t make a religion around text-mode command-mode-first editors.

Overhead

Interesting tidbit related to my previous post:

I used the Microsoft HTML control to obtain a DOM of HTML content, which is then straightforward to convert to XML. When I did this conversion from a test stub, it took roughly 1.5 seconds to convert the 80k of HTML on my machine.

Then when I placed the same code inside PowerShell, the conversion time went up more than forty times to 70 seconds. WTF?

I added traces and monitored thrown exceptions, but couldn’t find anything that would explain this. So I took a break and the solution materialized in my mind: The Microsoft HTML control is registered as a single-threaded apartment object, while PowerShell lives in a multi-threaded apartment. Marshalling and thread-switching costs were responsible for the tremendous slowdown.

The fix, after that, was simple: I created a shim that creates a STA thread and then ran the DOM processing on that thread. I was lazy so I didn’t implement a message loop and a hidden window, as it seems the whole control doesn’t make calls to other apartments and the function calls flow in only one direction.
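The hand-off part of such a shim can be sketched without any COM at all. The following is a minimal Python illustration, not the actual shim: it only shows "start a dedicated thread, run the work there, wait, and pass the result or exception back". The function name run_on_dedicated_thread and the thread name dom-worker are made up for the example. In .NET the thread would additionally be marked as STA before it is started (Thread.SetApartmentState with ApartmentState.STA); Python threads have no apartment concept, so that step has no equivalent here.

```python
import threading


def run_on_dedicated_thread(func, *args):
    """Run func(*args) on a fresh thread and return its result.

    Any exception raised by func is captured on the worker thread and
    re-raised on the calling thread, so the caller sees normal call semantics.
    """
    outcome = {}

    def worker():
        try:
            outcome["value"] = func(*args)
        except BaseException as exc:  # hand every failure back to the caller
            outcome["error"] = exc

    thread = threading.Thread(target=worker, name="dom-worker")
    thread.start()
    thread.join()  # block until the work is finished, like a synchronous call

    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


if __name__ == "__main__":
    # The work runs on "dom-worker", but the result comes back to the caller.
    print(run_on_dedicated_thread(len, "hello"))
```

Creating a thread per call is the lazy version; a long-lived worker thread fed from a queue would avoid the startup cost if the conversion is called often.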

STAs are evil.

Web Scraping with PowerShell

I created a pair of PowerShell cmdlets that are pretty useful for scraping websites and HTML fragments found in RSS feeds.

The first cmdlet is called Get-Web. It basically wraps the .NET WebClient class and allows you to download a web page as a string or a byte array, or to download a file from the web to a file.

Usage examples:

1> Get-Web http://reddit.com                                                   # page as a string
2> Get-Web http://jelovic.com/binaries/WebScrapingCmdLetsSrc.zip -Binary       # content as a byte array
3> Get-Web http://reddit.com c:\tmp\reddit.html                                # save straight to a file

The second cmdlet is called Clean-Html. It converts HTML content to XML, which is then easy to process. It takes a string containing HTML content and outputs an XML document. Example:

4> $xml = Get-Web http://reddit.com | Clean-Html

After that it’s a simple matter of using XPath (awesome tutorial by example, thanks Alex!) to get whatever information you need.

For example, let’s fish out all hyperlinks from reddit:

5> $reddit = get-web http://reddit.com | Clean-Html
6> $reddit.SelectNodes("//a") |
   % { $_.GetAttribute("href") } |
   ? { $_.ToLower().StartsWith("http://") }
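If you don’t have the cmdlets installed, the same select-then-filter idea is easy to try elsewhere. Here is a rough Python sketch of it, under a few assumptions: the input is already well-formed XML (which is the job Clean-Html does), SAMPLE_XHTML is a made-up stand-in for the cleaned-up page, and absolute_http_links is a name invented for this example. It uses the standard library’s ElementTree, whose ".//a" path plays the role of the "//a" XPath above.

```python
import xml.etree.ElementTree as ET

# Made-up stand-in for the XML that a cleaned-up page would produce.
SAMPLE_XHTML = """<html><body>
<a href="http://example.com/a">A</a>
<a href="/relative">B</a>
<a href="HTTP://EXAMPLE.COM/c">C</a>
<a>no href</a>
</body></html>"""


def absolute_http_links(xml_text):
    """Return the href of every <a> element whose target starts with http://."""
    root = ET.fromstring(xml_text)
    links = []
    for anchor in root.findall(".//a"):  # all <a> descendants of the root
        href = anchor.get("href")        # None when the attribute is missing
        if href and href.lower().startswith("http://"):
            links.append(href)
    return links


if __name__ == "__main__":
    for link in absolute_http_links(SAMPLE_XHTML):
        print(link)
```

As in the PowerShell version, the comparison is done on a lowercased copy, so links written as HTTP:// are kept and come out with their original casing.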

Instructions for use:

1. Download the cmdlets file. I tend to store my cmdlets in \bin\cmdlets.

2. From an admin command prompt, execute installutil WebScrapingCmdLets.dll in the directory where the file is located.

3. Open your profile.ps1 file (stored in Documents/WindowsPowerShell) and add a line saying Add-PSSnapin WebScrapingCmdLets at the very end.

You should be good to go now. Start a new PowerShell window and you can use them.

The source code is distributed under the FreeBSD licence.

Vista

Alex and I were talking about Vista’s poor market share. It’s been out for more than a year and compared to the uptake of other Microsoft operating systems it’s a dud.

Yet we both agreed that we’d never willingly go back to XP. The start menu searching is pretty darn awesome, UAC makes running Internet-facing applications much safer, PNRP makes it easy to connect to any computer inside my home from anywhere, and on the programming side there are goodies like IO transactions that are now part of Windows Server 2008 as well.

Vista is an improvement over XP, albeit an incremental one. So why is the perception of Vista so bad? Mostly it’s just Microsoft overpromising and underdelivering, plus with their bad rep everybody else was happy to kick them while they were down.