No Brain, No Pain

Blog


    November 27

    Windows Workflow, revisited

    So, I convinced myself that perhaps I was overly hard on Windows Workflow before, and made a commitment to myself to revisit the technology and see if I could get a better understanding of exactly what problem it solves.  This morning I perused the WF blogs and such that I have on my Favorites list and ran into this example of the simplest workflow possible.
     
    Sadly, I pretty much instantly reverted to my concern that WF is just technology for technology's sake; it's not making software development easier, rather it achieves the opposite and makes development more difficult.  Case in point, here's my version of the exact same functionality presented in the aforementioned WF example:
    using System;
     
    namespace SampleApp
    {
       class Program
       {
          static void Main(string[] args)
          {
             Console.Write("Please enter your name and press Enter: ");
             string greeting = String.Format("Hello, {0}!", Console.ReadLine());
             Console.Write(greeting);
          }
       }
    }
    The Windows Workflow version of this program has roughly eight times as much code as my version, not counting the XAML declaration of the workflow.  Hell, the XAML itself is almost identical in size to my entire program.
     
    Can someone please tell me what the return is for having to type 8x as much code?  I'm certainly not seeing it.
     
    Oh, well.  I'll give WF another stab in a week or two.
     
    Ben
    November 21

    Function entropy, part II

    So now that we have made a basic definition of what software entropy is, what can we use it for?
     
    Thus far, I've mainly applied it to functions defined in source code.  The goal is to identify the "sweet spot" of how much functionality should be implemented within a single function, in order to maximize the maintainability of code, while minimizing development time.
     
    That's a pretty abstract statement, so let's look at some actual examples.  First off, consider the zero-entropy function:
    float func1(float inputValue) 
    {
       return inputValue;
    }
    I think almost everyone would agree that this is a worthless function, as it does absolutely nothing.  Well, that's not quite true--it wastes compiler time, wastes space in the compiled code, wastes the developer's time keying in the function, and wastes time again whenever a developer has to step through the code in a debugging session.
     
    But let's go a bit further and consider the following function:
    float func2(float inputValue)
    {
       return inputValue + 1;
    }
    This function certainly has more entropy than the previous one, albeit just barely.  But I'd go so far as to say that it's just a waste of development time, too--it doesn't do enough to justify its existence.  Thus I make the following postulate:
    Low-entropy functions should be eliminated, because they don't provide enough value to justify the time taken to create them.
    On the opposite end of the spectrum is the function with too much entropy.  I won't bother typing one up, but imagine a single function with 50,000 lines of code which implements a complete order entry system.  A function such as that would be impossible to comprehend, let alone debug or modify.  Therefore I propose the second postulate:
    Functions of extremely high entropy are difficult to create, debug and maintain, and should be broken down into multiple functions.
    With that stated, it's now time for me to poke fun at the software development mainstream ;)  Take the following class as an example:
    class JustAnAverageClass
    {
       private int _value1;
       private float _value2;
       private string _value3;
     
       public int Value1
       {
          get
          {
             return _value1;
          }
     
          set
          {
             _value1 = value;
          }
       }
     
       public float Value2
       {
          get
          {
             return _value2;
          }
     
          set
          {
             _value2 = value;
          }
       }
     
       public string Value3
       {
          get
          {
             return _value3;
          }
     
          set
          {
             _value3 = value;
          }
       }
    }
    So what's wrong with this code?  I'm sure literally thousands of developers typed up classes similar to that today.  But look at all those get/set methods and what do you see?  Low-entropy code!!!!  Wouldn't the following achieve the same functionality?
    class JustAnAverageClass
    {
       public int Value1;
       public float Value2;
       public string Value3;
    }
    It's also less code to write, less code for the compiler to deal with, and faster for other developers to read and comprehend.  So why do most developers think they have to make everything a property with get/set methods?  I'm not sure, other than that they read it somewhere or someone else told them, "this is how you're supposed to do it."  If you ask me, they're just wasting valuable time that could be better spent developing higher-entropy code that actually does something useful.
     
    Ah, what's that grumbling I hear in the back?  Why not use a code generator to generate all the boilerplate get/set code, thus saving the developer the time normally wasted typing up all the get/set methods?
     
    Doh, wrong answer!  Think about it--should your goal really be generating worthless code faster?  Wouldn't it make more sense to eliminate that code altogether?
     
    Or am I completely off-base?
     
    Ben
    November 20

    Function entropy

    I have a small confession: my background isn't computer science.

    Actually, it's mechanical engineering.  Throughout college I always figured I'd be an automotive engineer at GM or somewhere, but the U.S. auto industry was not in great shape when I graduated in 1989, and automotive jobs for college kids were a bit on the scarce side.  So I ended up interviewing in the commercial nuclear power industry (now there's a growth industry for the last decade of the 20th century!) and during the interview they discovered that I had written a ton of software during college (I was really big into computational fluid dynamics).  Next thing I knew, they asked me if I wanted to join their software department, and within five minutes I had thrown 90% of my college engineering classes out the window.  Whoops!

    For better or worse, I think this has given me a somewhat unique view on software development, in that most of what I learned I learned empirically, instead of just reading it from a book.  The downside is that I was ignorant of a lot of basic computer science theories (3rd normal form?  What are you talking about?!?).

    But, it has allowed me to draw parallels between software engineering and more traditional engineering concepts.  And one of these that I find particularly interesting is attempting to apply the concept of entropy to software.

    Unfortunately, entropy is a very abstract concept, which makes it difficult to define (not to mention comprehend).  For my purposes, I suggest the following definition:

    Software entropy is a measure of change imparted to data by a program.

    Perhaps a few examples would help clarify this.  Here's an example of a function with zero entropy, because it does absolutely nothing:

    float func1(float inputValue)
    {
       return inputValue;
    }
    The following is a low (but non-zero!) entropy function:
    float func2(float inputValue)
    {
       return inputValue + 1;
    }
    Certainly that's not an extremely interesting function, but at least it does something!  Finally, here's a function with a bit more meat associated with it (and therefore, more entropy than the previous examples):
     
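    #include <math.h>   /* for sin() */
     
    float func3(float inputValue)
    {
       float result = 0;
       float increment = inputValue / 100.0f;
     
       /* Trapezoid rule: add up the areas of 100 trapezoids between 0 and inputValue. */
       for (int i = 0; i < 100; i++)
       {
          float x = i * increment;
          result += increment * ((sin(x) + sin(x + increment)) / 2.0f);
       }
     
       return result;
    }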
    Now we're talking!  A function that actually does something useful...  well, if using the trapezoid rule to compute the integral of sin(x) from 0 to inputValue is useful for you.  The point is, it does something more powerful (and potentially useful) than the previous examples.
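    For anyone who wants to sanity-check func3: the integral of sin(x) from 0 to a has the closed form 1 - cos(a), so the trapezoid sum can be compared against it.  The sketch below is a C# rendering written purely for illustration; the names TrapezoidCheck, IntegrateSin and ExactIntegral are made up here, and it uses double rather than float.

```csharp
using System;

static class TrapezoidCheck
{
    // Trapezoid rule for the integral of sin(x) on [0, upperLimit],
    // using the same fixed number of slices (100) as func3 by default.
    public static double IntegrateSin(double upperLimit, int steps = 100)
    {
        double increment = upperLimit / steps;
        double result = 0.0;

        for (int i = 0; i < steps; i++)
        {
            double x = i * increment;
            // Area of one trapezoid: width times the average of the two heights.
            result += increment * (Math.Sin(x) + Math.Sin(x + increment)) / 2.0;
        }

        return result;
    }

    // Closed form for comparison: the integral of sin(x) from 0 to a is 1 - cos(a).
    public static double ExactIntegral(double upperLimit)
    {
        return 1.0 - Math.Cos(upperLimit);
    }
}
```

    For an upper limit of pi the exact answer is 2, and with 100 slices TrapezoidCheck.IntegrateSin(Math.PI) should land within about a thousandth of it.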
     
    Ok, so that's the basis of software entropy.  Tomorrow I'll delve in a bit deeper and talk about some uses for it.
     
    Ben
    November 13

    Don't forget about what's important, and what's not

    Sitting through one of the SharePoint sessions at the Microsoft Connections conference, I was reminded yet again about how easy it is to lose track of the problem you're trying to solve with software.  Specifically, someone was demonstrating the steps needed to deploy a SharePoint Feature, which is a surprisingly ugly procedure--you have to brew up several huge and convoluted XML configuration files (sprinkled with easy-to-remember GUID references throughout [heavy sarcasm intended]), then package everything in a .cab file, then go through a bunch of deployment steps.  The presenter spent close to half an hour trying to get everything set up, and in the end the deployment wouldn't work.  Which turned out to be a pretty useful lesson:  if the expert teaching the session can't figure out how to deploy a SharePoint Feature, what's the chance a SharePoint novice like myself could successfully deploy it?  Answer:  Absolutely none.

    What was really irking me, though, was what a waste of time the whole deployment procedure was.  Whether a deployment takes 3 minutes or 40 hours makes no difference to the end user--they end up using the same software.  In essence, the SharePoint Feature deployment model is just wasting the developer's time.  Actually I find it pretty humorous that the ASP.NET group

    Any software solution consists of two categories of code (or configuration files, or resources, or whatever):  1) code that is directly involved in solving the domain problem at hand, and 2) code that doesn't reference the problem, but is needed to allow the software to actually work, a.k.a. housekeeping code.  The thing is, the latter is just wasted time.  It doesn't get the developer any closer to solving the problem for the user.  It's just a necessary evil we have to deal with in order to get the software running.

    I learned about this the hard way, back in 1987 or thereabouts (yeah, I'm old :-p).  At the time, I was hooked on 8086 assembler, to the point that I spent an entire year just writing assembler.  At the end of that year, though, I made a big realization:  assembler sucked!

    Actually, it was really cool that you could create ultra-fast programs that were tiny--I think I wrote a text-file viewer that ended up around 500 bytes long.  But over that time I started to realize that the vast majority of the time I spent coding was doing stupid stuff like converting numbers to ASCII characters I could write to the screen or validating input and things like that.  The actual amount of time I spent coding the solution to the problem domain was tiny, certainly under 10%.  The programs themselves were efficient, they just weren't an efficient solution to the problem.

    That's why I immediately liked .NET so much--in general they've done a fantastic job of minimizing housekeeping code so you can pretty much jump right into coding solutions to the problem domain.  Contrast that with 10-12 years ago, when you spent a good chunk of time setting up and registering the window class (RegisterClassEx(), CS_HREDRAW | CS_VREDRAW, remember all that fun stuff?) and setting up the message handlers.  Now, you just create a project, drop a couple of controls on the form, and you can start coding a solution to the problem at hand.  It's a wonderful thing.
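    To make the housekeeping idea concrete, here is a rough C# sketch (not from the original post) of the number-to-ASCII chore mentioned above, written once by hand and once with the framework call that replaces it.  The names HousekeepingDemo, ToAsciiByHand and ToAsciiWithFramework are made up for illustration.

```csharp
using System;
using System.Globalization;

static class HousekeepingDemo
{
    // Hand-rolled conversion: the sort of housekeeping you write yourself in assembler.
    public static string ToAsciiByHand(int value)
    {
        if (value == 0) return "0";

        bool negative = value < 0;
        long remaining = Math.Abs((long)value);   // long avoids overflow on int.MinValue
        char[] buffer = new char[11];             // room for a sign plus up to 10 digits
        int position = buffer.Length;

        // Peel off digits from the right and store their ASCII codes.
        while (remaining > 0)
        {
            buffer[--position] = (char)('0' + (int)(remaining % 10));
            remaining /= 10;
        }

        if (negative) buffer[--position] = '-';

        return new string(buffer, position, buffer.Length - position);
    }

    // Framework version: one call, no housekeeping to write or debug.
    public static string ToAsciiWithFramework(int value)
    {
        return value.ToString(CultureInfo.InvariantCulture);
    }
}
```

    Both methods return the same string for any int; the difference is that only the second leaves your time free for the problem domain.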

    I guess what really surprises me is that anyone would tolerate any bit of housekeeping code at all these days.  But I think the problem is that it's not always obvious when you're writing one type of code versus the other.  After all, they're both written in the same computer language, identical syntactically, and the useful, problem-domain-solving code and the low-value housekeeping code are mingled together.  It's very easy to think you're creating useful code, when in reality you're just wasting time.

    The moral of the story: be aware of the code you're writing and differentiate when you're writing code that's actually solving problems versus writing housekeeping code.  Try to maximize the former and minimize the latter.  And, if you find yourself spending too much time writing the latter, figure out why that's occurring and correct the situation.

    Ben

    P.S. If it turns out the reason for all your housekeeping efforts is a generic, reusable framework that's supposed to improve productivity, please throw in a small chuckle on my behalf ;)

    November 09

    Flipping the bozo bit

    Some years back I worked with a guy named Todd Wyder, who was an interesting fellow in many ways.  One of the things he used to say was, "don't flip the bozo bit."  At first I had no clue what he was talking about, but he explained the bozo bit as a boolean flag that you'd turn on for someone you determined was a bozo, i.e. any ideas they came up with weren't worth listening to.  Usually this occurs when someone says something profoundly stupid--"You know, I was thinking about it, and I really do think the world is flat!".  Todd's point was that people come up with both stupid and clever ideas, and if you tune someone out because they made a dumb statement once, you'll miss out on the clever ideas.
     
    Nonetheless, I flipped the bozo bit on someone Tuesday afternoon.  :-o
     
    It was the last session of the day here, and the session I was in turned out to be object-relational mapping (and I have a lot more to say on that subject).  About 2/3rds through the session a guy raised his hand and asked, "Can we utilize lazy load with this?"
     
    To me, that was the same as him standing up and saying, "Hi, I'm clueless when it comes to building large systems!  I like using unscalable patterns!"
     
    What was even funnier (or sadder, depending how you look at it), was the number of people in the room who nodded their heads in agreement with his question.
     
    I used to use lazy load.  I thought it was a good design paradigm, and told others to use it.  I thought it improved efficiency and improved performance.
     
    Then I started building larger systems.
     
    One day, on a larger project, the DBA came up to me and told me that the SQL database was processing an inordinate number of calls for the current user load. What's more, he started looking at the queries and noticed something strange:
    DBA: "Ben, the database server is really bogging down, it's getting hit with way more queries than I'd expect for the current user load."
    Ben: "Hmmm..."
    DBA: "I looked into the queries more, and it's bizarre--I'm seeing multiple queries hitting the same row of data repeatedly."
    Ben: "Uh-huh"
    DBA: "They're all occurring within a really short timeframe.  Could there be a problem with the app where it's issuing the same query more than once?  It would be much more efficient to just query the row once, you know."
    Ben: "No, no, that's how the application was designed.  It improves performance."
    DBA: "Umm, well.... you've achieved the opposite."
    Ben: "Uh-oh"
    Moral of the story: in the quest to improve performance, lazy load in reality increases the number of SQL statements that the database server has to process, ultimately reducing performance.  Sadly, though, many people never get to build a large enough system to realize that lazy load is a bad thing.
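    The post doesn't show the application's code, so the following is only a guess at the shape of the problem: a minimal C# sketch of one way lazy load can produce the pattern the DBA described.  FakeDatabase, LazyOrder and LoadComparison are hypothetical names, the "database" just counts the queries it receives, and the five sample orders belonging to customers 7 and 9 are invented data.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the SQL server: it only counts the queries it receives.
class FakeDatabase
{
    public int QueryCount;
    public Dictionary<int, int> CustomerRowHits = new Dictionary<int, int>();

    // Five sample orders that belong to just two customers (IDs 7 and 9).
    private static readonly int[] OrderCustomerIds = { 7, 7, 7, 9, 9 };

    // One query: returns the customer ID stored on each order.
    public int[] LoadOrders()
    {
        QueryCount++;
        return (int[])OrderCustomerIds.Clone();
    }

    // One query per call: fetches a single customer row and records the hit.
    public string LoadCustomerName(int customerId)
    {
        QueryCount++;
        int hits;
        CustomerRowHits.TryGetValue(customerId, out hits);   // hits is 0 if absent
        CustomerRowHits[customerId] = hits + 1;
        return "Customer " + customerId;
    }

    // One query: orders joined to customers, so every name comes back up front.
    public string[] LoadOrdersWithCustomerNames()
    {
        QueryCount++;
        var names = new string[OrderCustomerIds.Length];
        for (int i = 0; i < names.Length; i++)
            names[i] = "Customer " + OrderCustomerIds[i];
        return names;
    }
}

// Lazy load: each order fetches its customer the first time the name is read.
class LazyOrder
{
    private readonly FakeDatabase _db;
    private readonly int _customerId;
    private string _customerName;

    public LazyOrder(FakeDatabase db, int customerId)
    {
        _db = db;
        _customerId = customerId;
    }

    public string CustomerName
    {
        get
        {
            if (_customerName == null)
                _customerName = _db.LoadCustomerName(_customerId);
            return _customerName;
        }
    }
}

static class LoadComparison
{
    // Reads every order's customer name through lazy-loading objects.
    public static FakeDatabase RunLazy()
    {
        var db = new FakeDatabase();
        foreach (int customerId in db.LoadOrders())
        {
            var order = new LazyOrder(db, customerId);
            Console.WriteLine(order.CustomerName);
        }
        return db;
    }

    // Reads the same names from a single joined query.
    public static FakeDatabase RunEager()
    {
        var db = new FakeDatabase();
        foreach (string name in db.LoadOrdersWithCustomerNames())
            Console.WriteLine(name);
        return db;
    }
}
```

    With the five sample orders, RunLazy() issues six queries (one for the order list plus one per order) and fetches the row for customer 7 three times, while RunEager() issues a single query for the same output.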
     
    Oh, well.
     
    Ben
    November 08

    My conversation with Dino Esposito

    Last night I spotted Dino Esposito standing around at the Connections conference, so I walked over to him to say hello.  Dino writes (among other things) an excellent MSDN column.  What I like best about him is that his articles are usually simple enough to grasp quickly, but have enough "meat" in them that you can understand how you would incorporate the concepts in projects you're working on.  I run into far too many articles (and books, for that matter) that are garbage--they explain the concepts, but the examples are so trivial or contrived that you can't extend them to large-scale applications.
    Me: "Dino, I have to thank you for all the great columns you've written, I can't tell you how much they've helped me over the years."
     
    Dino smiles and says, "Thanks"
     
    Me: "But here's the real question:  How is Ferrari going to do next year with Räikkönen??"
     
    Dino's smile disappears, and is replaced with the best "for-the-love-of-God-what-the-hell-are-you-talking-about" expression I've ever seen on his face.  "Huh???" he replies.
     
    Me: "You know, Formula 1, Ferrari.  Forget all this computer garbage, this is the important stuff!  Schumacher's retired this year, Kimi Räikkönen is replacing him.  How is Ferrari going to do next year?"
     
    Dino:  "I'm not really a big fan of Formula 1"
    You win some, you lose some.
     
    Ben