PITADEV
Curiosity killed the developer's project.

Raw Power

Wednesday, 27 May 2009 12:56 by aj

One of the problems that comes along with writing my own blog on software development is that I often expose my own ignorance.  I don’t pretend to know everything, by any stretch of the imagination, but writing stuff that anyone can read comes with a modicum of “assumed authority,” which I clearly don’t have.  However, I sometimes stumble across things that are so cool that I just have to write about them, no matter how behind-the-times they may make me look.  So, I’ll just come out and say it now: PowerShell rocks my world.

Yesterday I was dinking around with a web-forms app that required some mildly tricky string manipulation.  When coding in Visual Studio, I’ve often found myself wishing that there were some way to use the Immediate Window to evaluate runtime behavior at design time.  For example, I can never remember string formats.  The .ToString(“format”) method is extremely common, but I’ve already got too much crap crammed in my skull to remember all of the possible values for “format.”  What I’ve always done in the past is fire up my project in Debug mode, put a breakpoint somewhere, and then use the Immediate Window to figure out which argument I need to pass to ToString().  There’s probably a better way to do this in general, but somehow my string of searches yesterday led to PowerShell, and goodness am I happy about it.  It’s not that I didn’t know that PowerShell existed.  Brad’s been singing its praises forever.  I just never took the time to understand how powerful it is.
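
For reference, here is a minimal C# sketch of the kind of format strings I keep forgetting.  The FormatStringDemo class and its Format helper are made up for illustration, and the invariant culture is passed explicitly so the results don't depend on the machine's regional settings.

```csharp
using System;
using System.Globalization;

static class FormatStringDemo
{
    // Hypothetical helper: formats any IFormattable value with the invariant
    // culture so the output is the same on every machine.
    public static string Format(IFormattable value, string format)
    {
        return value.ToString(format, CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        Console.WriteLine(Format(42, "D5"));       // 00042    (zero-padded decimal)
        Console.WriteLine(Format(255, "X4"));      // 00FF     (zero-padded hex)
        Console.WriteLine(Format(1234.5, "N2"));   // 1,234.50 (thousands separator, 2 decimals)
        Console.WriteLine(Format(3.14159, "F2"));  // 3.14     (fixed-point, 2 decimals)

        var posted = new DateTime(2009, 5, 27, 12, 56, 0);
        Console.WriteLine(Format(posted, "yyyy-MM-dd HH:mm"));  // 2009-05-27 12:56
    }
}
```

Without the explicit culture, "N2" and the date formats pick up whatever the current thread's culture says, which is exactly the sort of thing that is quicker to try out than to remember.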

A known bug with the Immediate Window that I don’t believe has been solved yet is the “'RegularExpressions' is not a member of 'Text'” error.  If you’ve ever tried to evaluate anything from System.Text.RegularExpressions in the Immediate Window, you’ve probably seen this.  After downloading and installing PowerShell, I started out working on how to test the evaluation of a Regex.Replace() statement.  First, I had to figure out how to get the System.Text.RegularExpressions namespace loaded.  I found a nice blog post that taught me how to do that, and I’m pretty sure that you can have a PowerShell script execute on launch as part of your profile.  This little guy will definitely be a part of it when I have time to figure it out.

    PS > $GacRootDir = Join-Path -Path $Env:SystemRoot -ChildPath "Assembly/Gac"
    PS > Get-ChildItem -Path $GacRootDir -Recurse -Include *.dll | ForEach-Object { $_.FullName } | ForEach-Object { [Reflection.Assembly]::LoadFrom($_) }

After that, most of what you can do is a matter of learning the syntax, and for a guy who spent the first six-odd years of his programming career using VIM in command prompts, I must admit that it’s strangely comforting to get back to a little command-line scripting.  Once my GAC assemblies were loaded, I set about solving my string manipulation problem.  Specifically, I needed to set a NavigateUrl property to a UNC path, which then got sent to a JavaScript method.  The code-behind is VB, but JavaScript, of course, likes its back-slashes to be escaped like C#.  For some reason my brain thought that I should use Regex.Replace() for this (maybe scripting reminded me enough of Perl that I wanted to do a little $foo =~ s/\\/\\\\/g; or something).  Testing this in PowerShell was simple:

[Image: RegexReplacePS]

Lovely.  Really, it was.  I didn’t have to fire up a debugging session to find out that Regex.Replace() is going to be funny about back-slashes too, which, if I’d actually taken two seconds to think, I would have already known.  Then, I remembered String.Replace():

[Image: PSStringReplace]

I know, I know, this is elementary stuff.  But the beautiful thing to me here is that PowerShell let me do this in a matter of seconds at design-time rather than having to write a separate application or fire up the debugger.
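
For anyone who wants the comparison in code, here is a rough C# sketch of the two approaches (not a transcript of my PowerShell session).  The BackslashDemo class and the UNC path are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class BackslashDemo
{
    // String.Replace treats both arguments literally, so a verbatim string
    // with one backslash is replaced by a verbatim string with two.
    public static string EscapeWithStringReplace(string path)
    {
        return path.Replace(@"\", @"\\");
    }

    // Regex.Replace needs the backslash escaped in the *pattern* (\\ matches
    // one backslash).  In .NET replacement text only '$' is special, so the
    // replacement below is two literal backslashes.
    public static string EscapeWithRegex(string path)
    {
        return Regex.Replace(path, @"\\", @"\\");
    }

    static void Main()
    {
        string unc = @"\\server\share\scan.tif";  // hypothetical UNC path

        // Both lines print \\\\server\\share\\scan.tif
        Console.WriteLine(EscapeWithStringReplace(unc));
        Console.WriteLine(EscapeWithRegex(unc));
    }
}
```

The regex version works, but the pattern and the replacement look identical while meaning different things, which is a good argument for plain String.Replace() here.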

Even more fun, I learned how to load unsigned .Net assemblies (and COM!) into PowerShell so I can run methods for testing and support issues without having to write a test app.  Simply using System.Reflection.Assembly, same as I would to load an unreferenced assembly in code, I can initialize a new object and then view its members and execute its methods.
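
As a point of comparison, this is roughly what the "in code" version looks like in C#.  It is only a sketch: LateBoundDemo is a made-up name, and the assembly that defines StringBuilder stands in for whatever unreferenced DLL you actually want to poke at.

```csharp
using System;
using System.Reflection;

static class LateBoundDemo
{
    // Loads an assembly from a file path and creates an instance of the
    // named type using its parameterless constructor.
    public static object CreateInstance(string assemblyPath, string typeName)
    {
        Assembly assembly = Assembly.LoadFrom(assemblyPath);
        Type type = assembly.GetType(typeName, true);  // true: throw if the type is missing
        return Activator.CreateInstance(type);
    }

    static void Main()
    {
        // Stand-in for an unreferenced DLL: the assembly that defines StringBuilder.
        string path = typeof(System.Text.StringBuilder).Assembly.Location;
        object instance = CreateInstance(path, "System.Text.StringBuilder");

        // View its members...
        MethodInfo[] methods = instance.GetType().GetMethods();
        Console.WriteLine("Public methods found: " + methods.Length);

        // ...and execute one of them by name.
        MethodInfo append = instance.GetType().GetMethod("Append", new[] { typeof(string) });
        append.Invoke(instance, new object[] { "hello" });
        Console.WriteLine(instance);  // hello
    }
}
```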

So, maybe I just made myself look ignorant, but I needed to say something because I’m so excited to be finally learning to work with this powerful tool.  If you’re a .Net developer and you’re not already using PowerShell, I highly recommend that you go and check it out.

Categories:   .Net | Tools

A Necessary Evil

Tuesday, 19 May 2009 08:50 by aj

Creative Commons user Paulus Veltman

One of the most well-known code smells in .Net development is explicitly messing with the Garbage Collector.  In my coding travels, I’ve had at least a couple of situations where a performance/memory consumption issue was traced back to a GC.Collect() call.  In my own applications, I’ve generally followed Rico Mariani’s Rule #1, “don’t use it.”  I figure the .Net Framework team has a much better idea of how to handle that sort of stuff than I do, and for the most part, their ideas work just fine.

I’ve been working on a project for the last month or so that requires the use of an OCR API to read type-written words from TIFF images and then run them up against a set of regular expression patterns looking for useful data.  There are several OCR libraries available for purchase, but after significant testing and analysis, we decided that the Microsoft Office Document Imaging (MODI) library that comes with Office 2003 works about as well as anything else, and has the added benefit of being free, since our workstations haven’t been upgraded to Office XP.  I spent some time figuring out how to use MODI for OCR, then found a nice tutorial on Code Project that would have saved me some time.

Our TIFF images are stored in a homegrown database/file system setup that, after years of working with FileNet, is a cool breeze on a hot day.  Through more prototyping, I found that MODI OCR’s performance was best if I copied each multipage TIFF file to the local system, split it into single-page files, and then ran each file through the OCR process, cleaning everything up afterward.  Here’s a chopped down version:

    MODI.Document modiDoc = null;
    MODI.Image modiImage = null;
    MODI.Word modiWord = null;
    List<String> filesToProcess = null;
    try
    {
        filesToProcess = SplitTif(inputFile, workingFolder);
        foreach (var fileToProcess in filesToProcess)
        {
            try
            {
                modiDoc = new MODI.Document();
                modiDoc.Create(fileToProcess);
                modiDoc.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);
                for (var i = 0; i < modiDoc.Images.Count; i++)
                {
                    modiImage = (MODI.Image)modiDoc.Images[i];
                    for (var j = 0; j < modiImage.Layout.Words.Count; j++)
                    {
                        modiWord = (MODI.Word)modiImage.Layout.Words[j];
                        if (System.Text.RegularExpressions.Regex.IsMatch(modiWord.Text, regexPattern))
                        {
                            //Do stuff that needs to be done when there's a hit
                        }
                    }
                }
            }
            finally
            {
                if (modiWord != null)
                    DisposeCom(modiWord);
                if (modiImage != null)
                    DisposeCom(modiImage);
                if (modiDoc != null)
                    DisposeCom(modiDoc);
                modiWord = null;        //This may be redundant...?
                modiImage = null;
                modiDoc = null;
            }
        }
    }
    catch (Exception ex)
    {
        //handle it
    }
    finally
    {
        //Cleanup temp files
        if(filesToProcess != null && rdoNIS.Checked)
            filesToProcess.ForEach(System.IO.File.Delete);
    }

Notice all of the cleanup.  I’ve learned the hard way that it’s never a bad idea to be very explicit about cleaning up COM objects, especially since a lot of COM objects don’t implement anything close to IDisposable.  So, with this code, I thought I was good to go.
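
The DisposeCom helper itself isn't shown in the chopped-down listing.  A common way to write that kind of helper, and this is a sketch under that assumption rather than the exact code from my project, is a thin wrapper around Marshal.FinalReleaseComObject.  The ComCleanup class name and the bool return value are made up for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

static class ComCleanup
{
    // Hypothetical stand-in for DisposeCom: releases every reference the
    // runtime holds on a COM object.  Returns true only when a release was
    // actually performed, so null and ordinary managed objects are ignored.
    public static bool DisposeCom(object comObject)
    {
        if (comObject == null || !Marshal.IsComObject(comObject))
            return false;

        // Drops the wrapper's reference count to zero; the wrapper must not
        // be used again after this call.
        Marshal.FinalReleaseComObject(comObject);
        return true;
    }

    static void Main()
    {
        Console.WriteLine(DisposeCom(null));            // False
        Console.WriteLine(DisposeCom("just a string")); // False
    }
}
```

A helper like this only releases the wrappers you explicitly hand to it.  Intermediate objects such as modiDoc.Images or modiImage.Layout get wrappers of their own that never reach the helper, so they hang around until the garbage collector finalizes them.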

And for the most part, I was.  My company has two main data centers: one in the building where I work, in a city I’ll call St. Small, and one that’s roughly 1,300 miles away, in a city I’ll call Sunville.  The TIFF images and the database I’m persisting data to are in Sunville.  The workstations on which I started out running my OCR client application are all in St. Small.  They worked fine, albeit slowly due to network latency and crappy hardware.  But I wanted the process to run faster, since we have a lot of images to sift through, so I nabbed up the only workstation I could find in Sunville, set up my client, and started to run a batch.

I was quite surprised when I repeatedly and inconsistently received OutOfMemory exceptions from the MODI OCR method.  I checked all of the system resources, running programs, and RAM, and everything looked fine.  It’s running a dual-core processor at 2.4 GHz with 2 GB of RAM, which should be totally adequate for the MODI process, right?  Wrong.  No matter how hard I tried, I could not get these exceptions to go away.  What was even more interesting is that I wasn’t getting the errors on the workstations in St. Small.

So what’s the difference?  Duh.  Since the Sunville workstation doesn’t have nearly as much network latency to deal with, it does run quite a bit faster.  Since it’s able to run faster, it’s creating and destroying MODI COM instances much more frequently than the copies running on the St. Small systems.  So I did some more web research, more testing, and with a cringe, I added this line of code (and the comment) to my finally block: 

    //THIS IS VERY VERY BAD YOU BAD BOY
    GC.Collect();

I tested it on my workstation and performance didn’t seem to suffer too much.  So I dropped the new version of my application on the Sunville workstation and fired it up, thinking I’d solved the problem.

Nope.  The Sunville system still threw random OutOfMemory exceptions.

Creative Commons User Photos o' Randomness

What gives?  I thought GC.Collect() was the magic baseball bat that beat the crap out of everything?  If it isn’t, why is it so terrible to use it?  Well, the answer is, it is terrible to use.  Read Rico’s post and the many other articles on Garbage Collection, if you don't believe me.  But in my situation it seems necessary, since we have so many images to process and I don’t want to babysit every instance of the application.  I still had the problem, though.  Why wasn’t GC.Collect() working?

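Because I wasn’t using it correctly, that’s why.  The client application I wrote was a quick and dirty Windows Forms app, so all of the MODI OCR calls were synchronous.  The cleanup that GC.Collect() kicks off, on the other hand, is not: the call itself blocks until the collection is done, but finalizers (which is where the .Net wrappers around COM objects actually let go of them) run afterward on a separate finalizer thread.  However, you can force your code to wait for them by adding one line of code, which I did, and now my application runs wherever I want it to.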

    //THIS IS STILL VERY BAD AND YOU ARE STILL A BAD BOY
    GC.Collect();
    GC.WaitForPendingFinalizers();

It’s funny—I can actually see the points when the code execution is sitting on that line.  It doesn’t happen often, but it clears up all of my errors.  If anyone knows a better way, I would love to hear it.  I don’t want to use it, but for this project, I’ve come to believe that forcing garbage collection is a necessary evil.
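
To see the collect-then-wait pattern without needing MODI installed, here is a small self-contained C# sketch.  FakeComWrapper is a made-up stand-in for a COM wrapper whose real cleanup happens in a finalizer, and GcDemo is just a hypothetical harness.  The second GC.Collect() is an optional extra that is often recommended and is not something my application uses.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Made-up stand-in for a COM wrapper: the real cleanup happens in the finalizer.
sealed class FakeComWrapper
{
    public static int FinalizedCount;

    ~FakeComWrapper()
    {
        Interlocked.Increment(ref FinalizedCount);
    }
}

static class GcDemo
{
    // Allocating in a separate, non-inlined method makes sure no local
    // variable in the caller keeps the objects alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void CreateGarbage(int count)
    {
        for (var i = 0; i < count; i++)
            new FakeComWrapper();
    }

    public static int ProcessBatch(int count)
    {
        CreateGarbage(count);

        GC.Collect();                   // blocks until the collection is done and
                                        // queues unreachable wrappers for finalization
        GC.WaitForPendingFinalizers();  // blocks until the finalizer thread drains the queue
        GC.Collect();                   // optional: reclaims the memory of the objects
                                        // that were just finalized

        return Volatile.Read(ref FakeComWrapper.FinalizedCount);
    }

    static void Main()
    {
        // Typically reports 1000, although the runtime makes no hard promise
        // about exactly when an object becomes collectable.
        Console.WriteLine("Finalized so far: " + ProcessBatch(1000));
    }
}
```

Dropping the GC.WaitForPendingFinalizers() line makes the count unpredictable, since the method can return before the finalizer thread has done its work.  That is the same gap my Sunville workstation kept falling into.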

Categories:   .Net

The Essence of Punk

Saturday, 16 May 2009 06:40 by aj

One of my favorite things about being a programmer is that I get to listen to music all day.  I know that many of my peers do this as well, and we’ve all seen silly suggestive lists of “the best music to code to.”  I’m old enough and cynical enough to know that my taste in music is better than everyone else’s, so I’m not going to waste any time arguing about what’s good or bad.  It did occur to me the other day as I rocked-out-with-my-Mock-out that there are certain songs that define certain genres for me.  So, I guess I’ll go ahead and start a silly suggestive list.

Musical genres themselves are stupidly annoying.  Show me the person who says they can definitively state the differences between punk, hardcore punk, and metal-core, and I’ll show you a person whose argument is easily Swiss-cheesed.  Genres are different for every person.  What’s punk to me may be rock and roll to someone else.  An original punk rocker might listen to some of these songs and say, [In sloppy British accent] “Whot tha fock, that’s fockin’ rocknroll, not punk!”

So, with that lengthy and broken disclaimer made, I now propose three songs that, to me, are quintessential punk rock.  When I think of punk, I think of songs like these:

Scared of Chaka – “A Lie and a Cheat”

This is the song that made me think about doing this list in the first place.  The classic, lo-fi sound, pounding drums, and ripping guitar, along with the understated distorted lyrics make this song a lot of fun to bounce around in an office chair to.

 

Dillinger Four – “Mosh for Jesus”

Sure, they’re from the same city as me, but I still think that Dillinger Four is one of the exemplary punk rock bands of the last 20 years.  The sharp contrast between vocals and the unapologetic rampage of guitar and drums always gets me nodding my head.  Besides, what a great song title!

 

The Arrivals – “Born with a Broken Heart”

Not only is this a great punk song, it’s a great rock song.  There I go, mixing genres again.  Swiss-cheese is me.  The bridge alone in this song makes it brilliant.

 

 

Those are the only three I can think of right now, but I’ve been missing my old days as a music reviewer, so I’ll probably whip out more totally off-topic and completely subjective posts like this in the future.  Meanwhile, dear readers (assuming I have more than one), what songs scream PUNK to you?


Some Things Are Worth Paying For

Friday, 15 May 2009 06:42 by aj

[Image: DualMonitorsWork]

I am a huge proponent of multiple monitor setups for developing.  The more real estate I have, the more productive I am, and the happier I feel.  Sure, it’s hard to come up with metrics for this sort of thing, which is exactly why many companies (that aren’t software companies, most of whom understand that hardware is cheap and brains are not) won’t spring for extra monitors for their programmers.  My current company is that kind of company.  I even went so far as attempting to find ways to convince them to invest in things like multiple monitors and ReSharper, but have been so busy that I haven’t been able to put Mr. MacIntyre’s plan into action.

My boss, however, is quite reasonable about things like this, so much so that he went to an auction site and used his personal money to buy us each (there are only three of us right now, so he didn’t have to shell out too much) an extra monitor as long as we were willing to chip in for the hardware needed to run it.  We have the standard Dell low-profile desktop workstations with on-board video that only support one VGA monitor, and I didn’t want to crack the case on something that my company owned, so I was presented with a bit of a conundrum.

Then I learned about one of the coolest little products I’ve seen in a while, the EVGA UV 12+.  After a quick driver install, plug this little box (it measures about 3” x 3” x 1”) into a USB 2.0 port, then plug your extra monitor in (DVI is the default, but they provide a VGA adapter), and voilà, dual monitors!  You can even stack more than one UV 12 to run even more monitors.  Our little Dells only push 1440x900, which is the max widescreen resolution on the UV 12 (and is adequate on the twin 19" widescreens I have at work), but if you want higher resolutions you can upgrade to the more powerful UV 16.

So, with a generous boss and about 46 bucks with shipping, I now have dual monitors at work, and this pleases me.

Categories:   Hardware | Tools

Stack Overflow Dreams

Thursday, 14 May 2009 11:25 by aj

I, like thousands of other developers, have fallen in love with Stack Overflow.  I still use Google for the random programming questions that my teammates and I come across during the work day, but I am always stoked when an SO link comes up on my searches.  I generally check the site periodically throughout the day, although I don’t have time to troll it (like several other users seem to) and haven’t quite committed to setting up an RSS feed for it.  It is a tremendous resource, and though I was initially skeptical about its completely user-driven format, I’ve come to realize that that is exactly what makes it such an effective site.

For some odd reason, I’ve had particularly vivid dreams lately (no, it’s not as a result of Chantix, although it should be).  The other night I dreamt that I was sitting in my chair in our living room with my laptop on my lap, as I do most evenings after work.  I think we were watching the Twins bullpen throw another game, and my daughters were in their various states of chaos.  All in all, a very normal evening, except I was dreaming it.  Anyway, in this dream, I was in the midst of one of my frequent evening scrolls through the Unanswered question list on Stack Overflow, and noticed that my reputation had suddenly increased by 50 points.  I went to my user-profile page, and it had increased again, along with two new silver badges.  I clicked on the logo to go back to the SO homepage, and my reputation had increased by thousands, with gold badges and silver badges and accolades galore.  I was euphoric! 

Shamelessly Paint.Netted from Skeet's rep

It was such a realistic dream that I actually checked my reputation right away the next morning, only to see the sad reality  that is my true reputation.

Some day, 500!

This got me kind of mad.  I mean, what do I care what a bunch of other geek/dork/nerds think of my geek/dork/nerd abilities?  Just because I don’t hang on every single (often unintelligible) question that is posted shouldn’t make me less important or intelligent than the Stack Overflow Pantheon.  I mean, seriously, do these guys even have real jobs?  Do their bosses know that they’re spending hours of precious coding time scanning for reputation fodder?

Being a ruminant son of a bitch, I reflected on this for a while, and realized that I do care about my Stack Overflow reputation, and that’s OK, because besides being the best place on the web for software development questions, Stack Overflow has become a center of excellence.  It is the premier community of software engineers on the internet, and having a good reputation among your peers is always a good thing.

Lately I’ve been posting more answers because I’ve decided that, good or bad, answers always help everyone.  Even if they’re bad or incorrect, they are a learning experience for both me and everyone else that sees how the community reacts to them.  It’s like a class that you really want to learn from; good students understand that in order to learn, they need to pay attention and put themselves out there, both by asking and answering questions.  And the brilliance of the site’s underlying mechanisms is that you are rewarded for putting yourself out there.  Every now and then I’ll hit the site and notice that my reputation has been bumped up by ten points because someone up-voted something I posted months ago.  That’s really frickin’ cool, if you ask me.

So, I say kudos to Spolsky and Atwood for designing a site that has become such a powerful community.  And even if my reputation sucks, I’ll still be reading, asking, answering, and ultimately, learning.

Categories:   Around the Web