Sunday, September 6, 2009

Good programming practices

While conventions and standards bring uniformity to code and make it easier to understand, developers are also strongly advised to follow good coding practices that have been compiled over time. If you browse the Internet, you will find a vast number of resources on good programming practices. Some of them have stood the test of time and are always relevant, some are not really significant and some do not fit our conventions. However, a few points are relevant both to our organization and to C# developers in general, and apply to the code we write regularly.

Today, we will have a look at such “good programming practices”:

Variables
  • Declare one variable per line. Do not list multiple variables on the same line even if they are of the same type. By declaring one variable per line, you can easily add a comment for each variable, which you should (must for member variables).
  • Declare and initialize variables in the same order they are used. This applies to both member and local variables.
  • Always initialize variables, if possible at the point of declaration. There should be a valid reason if a variable is not initialized.
  • Declare and initialize variables close to where they are used. This is very important because only by identifying their scope of usage can one decide which variables should stay local and which are required as member variables. The idea is to declare and use variables as locally as possible. Declaring variables close to their usage, however, does not mean that variables can be declared anywhere. In methods, the variable declarations should be at the top.
  • Do not make member variables public or protected. Keep them private and expose them through public/protected properties.
  • In a method, use “this.” for member variables. This way, member variables can be easily distinguished from local ones.
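
Here is a minimal sketch of the variable guidelines above. The class InvoiceCalculator, its members and the tax rate value are hypothetical and exist only for illustration.

```csharp
// Hypothetical class, used only to illustrate the guidelines.
public class InvoiceCalculator
{
    // Tax rate applied to the net amount (0.13 is an assumed value).
    private decimal taxRate = 0.13m;

    // Number of totals calculated by this instance.
    private int invoiceCount = 0;

    // Member variables stay private and are exposed through properties.
    public decimal TaxRate
    {
        get
        {
            return this.taxRate;
        }
        set
        {
            this.taxRate = value;
        }
    }

    public int InvoiceCount
    {
        get
        {
            return this.invoiceCount;
        }
    }

    public decimal CalculateTotal(decimal netAmount)
    {
        // Locals: one per line, initialized at declaration, at the top.
        decimal taxAmount = netAmount * this.taxRate;
        decimal total = netAmount + taxAmount;

        // "this." makes it obvious that a member variable is touched.
        this.invoiceCount++;

        return total;
    }
}
```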

Constants
  • Constants should be used instead of hardcoded numbers in code. Declare and initialize constants at the top and use them throughout the code.
  • Remember, the rule for variables about one declaration per line applies to constants as well.
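
A small sketch of the constant guidelines, assuming a hypothetical MembershipPolicy class. The values 18 and 120 are assumptions chosen only for the example.

```csharp
// Hypothetical class, used only to illustrate the guidelines.
public class MembershipPolicy
{
    // Age from which a member is treated as an adult (assumed value).
    private const int MATURITY_AGE = 18;

    // Highest age accepted as valid input (assumed value).
    private const int MAXIMUM_AGE = 120;

    public bool IsValidAge(int age)
    {
        // Named constants instead of hardcoded 0..120 checks.
        return age >= 0 && age <= MAXIMUM_AGE;
    }

    public bool IsAdult(int age)
    {
        return age >= MATURITY_AGE;
    }
}
```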

Session variables
  • Use session (or even application) variables sparingly. Session variables should be used only when some data has to be stored throughout the session of the user. Never use them for temporary storage purposes.
  • Do not store large objects in session variables. 
  • Do not use session variables throughout the code. Use session variables only within the classes and expose methods to access the value stored in the session variables.
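
The last point can be sketched as a small wrapper class. To keep the example self-contained, a plain dictionary stands in for the ASP.NET session state; that substitution, the class name UserSession and the key name are all assumptions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper: the only place that knows the session key.
// The dictionary is a stand-in for the real session state object.
public class UserSession
{
    // Key under which the user name is stored.
    private const string USER_NAME_KEY = "UserName";

    // Backing store that plays the role of the session.
    private IDictionary<string, object> store = null;

    public UserSession(IDictionary<string, object> store)
    {
        this.store = store;
    }

    public string GetUserName()
    {
        object storedValue = null;
        bool isFound = this.store.TryGetValue(USER_NAME_KEY, out storedValue);

        return isFound ? (string)storedValue : String.Empty;
    }

    public void SetUserName(string userName)
    {
        this.store[USER_NAME_KEY] = userName;
    }
}
```

The rest of the code calls GetUserName and SetUserName and never touches the key string directly.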

Control flow (branches and loops)
  • All flow control primitives (if, else, while, do, for, switch) should be followed by a block (a pair of curly braces) even if it is empty. Always put each curly brace on a new line.
  • All switch statements should have a default label as the last case label. 
  • Convert strings to lowercase or uppercase before comparing. It is important to bring the strings on both sides of a condition to one case before performing comparisons.
  • Do not make explicit comparisons to boolean values: true or false. 
  • Use ternary operator for simple if-else statements. 
  • Use String.Empty instead of “” while checking if a string is empty.
  • Do not compare floating points using == or != operators. 
  • Use StringBuilder class instead of String when you have to manipulate string objects in a loop.
  • When casting types, always check if there is a possibility of loss of precision.
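
Several of the points above can be sketched in one place. The class ControlFlowSamples, its method names and the tolerance value are hypothetical.

```csharp
using System;
using System.Text;

// Hypothetical helper class, used only to illustrate the guidelines.
public static class ControlFlowSamples
{
    // Tolerance for floating point comparison (assumed value).
    private const double TOLERANCE = 0.000001;

    public static bool AreNamesEqual(string firstName, string secondName)
    {
        // Bring both sides to one case before comparing.
        return firstName.ToUpperInvariant() == secondName.ToUpperInvariant();
    }

    public static bool AreNearlyEqual(double first, double second)
    {
        // Compare the difference against a tolerance instead of using ==.
        return Math.Abs(first - second) < TOLERANCE;
    }

    public static string DescribeLevel(int level)
    {
        string description = String.Empty;

        switch (level)
        {
            case 1:
                description = "Low";
                break;
            case 2:
                description = "High";
                break;
            default:
                description = "Unknown";
                break;
        }

        return description;
    }

    public static string JoinNumbers(int count)
    {
        // StringBuilder avoids creating a new string on every iteration.
        StringBuilder builder = new StringBuilder();

        for (int i = 0; i < count; i++)
        {
            builder.Append(i);
        }

        return builder.ToString();
    }

    public static string GetStatus(bool isActive)
    {
        // No "isActive == true"; a ternary covers the simple if-else.
        return isActive ? "Active" : "Inactive";
    }
}
```

For the string comparison, the framework also offers String.Equals with StringComparison.OrdinalIgnoreCase, which gives a case-insensitive comparison without creating the temporary converted strings.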

Methods
  • Name a method so that it tells what it does. It saves you from writing a lot of comments. For example, instead of naming a method that saves a phone number as SaveData(string phoneNumber), use SavePhoneNumber(string phoneNumber).
  • Do not create two methods with names that differ only by case (even if their purpose or scope is different). This applies to all objects including namespaces, classes, variables etc.
  • A method should have only one return statement. Avoid multiple or conditional return statements.
  • Never return null for an empty collection.
  • All variants of an overloaded method should be used for the same purpose and have the same behavior.
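
A short sketch of the method guidelines, assuming a hypothetical PhoneBook class: descriptive names, a single return statement, and an empty collection instead of null.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class, used only to illustrate the guidelines.
public class PhoneBook
{
    // Phone numbers saved so far.
    private List<string> phoneNumbers = new List<string>();

    public void SavePhoneNumber(string phoneNumber)
    {
        this.phoneNumbers.Add(phoneNumber);
    }

    public List<string> FindPhoneNumbers(string prefix)
    {
        List<string> matches = new List<string>();

        foreach (string phoneNumber in this.phoneNumbers)
        {
            if (phoneNumber.StartsWith(prefix, StringComparison.Ordinal))
            {
                matches.Add(phoneNumber);
            }
        }

        // Single return; with no matches this is an empty list, not null.
        return matches;
    }
}
```

Callers can loop over the result directly without a null check.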

Events and delegates
  • Always raise events through a protected virtual method.
  • Always use the sender/arguments signature for event handlers.
  • Do not programmatically call an event handler. Instead, code the action in a separate method and call the method.
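
The three points above can be sketched together. Document, SavedEventArgs and SaveLogger are hypothetical names; the pattern itself (a protected virtual On... method and the sender/arguments signature) is the common .NET one.

```csharp
using System;

// Event arguments carry the data of the event.
public class SavedEventArgs : EventArgs
{
    private string fileName = String.Empty;

    public SavedEventArgs(string fileName)
    {
        this.fileName = fileName;
    }

    public string FileName
    {
        get
        {
            return this.fileName;
        }
    }
}

public class Document
{
    public event EventHandler<SavedEventArgs> Saved;

    public void Save(string fileName)
    {
        // The actual saving is left out of this sketch.
        this.OnSaved(new SavedEventArgs(fileName));
    }

    // The event is raised only through this protected virtual method.
    protected virtual void OnSaved(SavedEventArgs e)
    {
        EventHandler<SavedEventArgs> handler = this.Saved;

        if (handler != null)
        {
            handler(this, e);
        }
    }
}

public class SaveLogger
{
    private string lastFileName = String.Empty;

    public string LastFileName
    {
        get
        {
            return this.lastFileName;
        }
    }

    public void Attach(Document document)
    {
        document.Saved += this.Document_Saved;
    }

    // The action lives in its own method, so other code calls this
    // method instead of calling the event handler.
    public void RecordSave(string fileName)
    {
        this.lastFileName = fileName;
    }

    private void Document_Saved(object sender, SavedEventArgs e)
    {
        this.RecordSave(e.FileName);
    }
}
```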

Some general tips
  • Avoid fully qualified type names. Use the using statement instead.
  • Do not hardcode strings. Use resource files.
  • Always use external style sheet to control the look and feel of the pages (even for images).
  • Keep the names of querystring arguments short (2-3 characters). For example, use “fn=test.txt&v=1.1” instead of “filename=test.txt&fileversion=1.1”.
  • If you have opened database connections, sockets, file streams etc., always close them in the finally block.
  • Always set a reference field to null after usage to tell the Garbage Collector that the object is no longer needed. 
  • Avoid implementing a destructor. If a destructor is needed, also use GC.SuppressFinalize.
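
The last few tips can be sketched as follows. FileTools and NativeBuffer are hypothetical names, and the cleanup in NativeBuffer is a simplified stand-in for the full dispose pattern, not a complete implementation.

```csharp
using System;
using System.IO;

public static class FileTools
{
    public static string ReadFirstLine(string path)
    {
        string firstLine = String.Empty;
        StreamReader reader = null;

        try
        {
            reader = new StreamReader(path);
            firstLine = reader.ReadLine() ?? String.Empty;
        }
        finally
        {
            // Runs whether or not an exception was thrown above.
            if (reader != null)
            {
                reader.Close();
            }
        }

        return firstLine;
    }
}

// Hypothetical class that needs a destructor.
public class NativeBuffer : IDisposable
{
    private bool isDisposed = false;

    public bool IsDisposed
    {
        get
        {
            return this.isDisposed;
        }
    }

    public void Dispose()
    {
        this.ReleaseResources();

        // Cleanup is done, so the finalizer no longer needs to run.
        GC.SuppressFinalize(this);
    }

    ~NativeBuffer()
    {
        this.ReleaseResources();
    }

    private void ReleaseResources()
    {
        if (!this.isDisposed)
        {
            // Unmanaged resources would be released here.
            this.isDisposed = true;
        }
    }
}
```

The C# using statement compiles to the same try/finally shape and calls Dispose for you, so it is a shorter way to get the same guarantee for disposable objects.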

Code structure: Size and layout

How many times have you come across a source file so long that tracing a piece of code in it makes you sick, not only because there is too much code to dig into but also because the scroll bar is too small to grab? ;-)

It is important to know when it is enough!

  • If a method exceeds 30-40 lines, you should consider splitting it into smaller methods. Business logic rarely needs more than 40 lines without an option to divide it into segments. As long as dividing the code into methods makes sense, the number of methods does not matter. Compartmentalized code is more readable and easier to understand.
  • Watch the number of arguments that a method has. If the list of arguments exceeds 5-6, think again. Maybe you can use a structure or a class for them. You don’t want to have a terrible argument list like the COM methods in Office Automation, do you?
  • 300-400 lines of code in a file already make the file huge. If a source file exceeds this limit, you must be missing something. Check whether a new class can be introduced or whether your code is not optimized enough.
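
As a sketch of the second point, a long argument list can be folded into one class. EmailMessage and Mailer are hypothetical names chosen for the example.

```csharp
// Hypothetical class that groups what would otherwise be six arguments.
public class EmailMessage
{
    public string Sender { get; set; }

    public string Recipient { get; set; }

    public string Subject { get; set; }

    public string Body { get; set; }

    public bool IsHtml { get; set; }

    public int Priority { get; set; }
}

public static class Mailer
{
    // One argument instead of
    // BuildSummary(sender, recipient, subject, body, isHtml, priority).
    public static string BuildSummary(EmailMessage message)
    {
        return message.Sender + " -> " + message.Recipient + ": " + message.Subject;
    }
}
```

Adding a seventh value later means adding one property, without changing the signature of every method that passes the message along.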

Layout of code

Just like a blueprint for a building or an editor’s draft of a book, the code layout is very important. When you are using Word or any similar editor, have you not noticed how many options are available to format the text (headings, subheadings, emphasis and much more; we call them markup)? What is the advantage of a good layout or formatting of code? You make the code readable and help others to maintain it easily when needed.

How easy is it to understand a statement like this?

querystring="name="+userName+"&userid="+userID+"&email="+email;

For developers who are familiar with both Visual Basic (which uses the & operator to concatenate) and C# (which uses the + operator to concatenate), this statement will hold you up for a while before you figure out which operator is used to concatenate the string.

Now what about this:

querystring = "name=" + userName + "&userid=" + userID + "&email=" + email;

It is the same statement, but the concatenation operator (+ in this case) is much more distinct than in the earlier example.

Let us talk about the standards for code layout:

  • Always indent code. Use tabs to indent code, not spaces.
    There is enough argument on the web on whether to use spaces or tabs to indent. Using tabs can be troublesome if different developers use different length of spacing for tabs, while using spaces has always been a tedious job. However, when all developers are using the same editor (everyone in our organization uses Visual Studio 2005 or 2008 for development in C# with default setting of tab at 4 characters), using tabs is the best way to indent.

  • Never leave more than one blank line between statements.
    Blank lines should be used to separate units such as methods, and also to separate different segments of code for clarity. However, it is never necessary to use more than one blank line anywhere in code. Avoid using more than one blank line, but do not give up blank lines altogether. Always use a single blank line to separate methods and logical segments of code.

  • Know when to use spaces and when not.
    • Always use a space between an operator and an operand. However, in case of unary, increment or decrement operator, do not use a space between the operator and the operand.
    • For flow control primitives (if, else, while, do, for, switch), always put a space between the keyword and the starting bracket. But do not use a space between bracket and text inside a bracket.
    • Always follow a comma or a semicolon with a space.
    • Do not use a space between the method name and the bracket while defining or calling a method.
    • Do not use a space within square brackets for subscripts.

Bad:
if (x==y)
for(i=0;i<10;i++)
BuildSampleString ( myChar, 0, 1 );
x = dataArray[ index ];
Good:
if (x == y)
for (i = 0; i < 10; i++)
BuildSampleString(myChar, 0, 1);
x = dataArray[index];

  • Always put curly braces on a separate line.
    For example:

    class MyClass {
       …
    }

    or

    if (x == y) {
       …
    }

    should be written as:

    class MyClass
    {
       …
    }

    and

    if (x == y)
    {
       …
    }

It takes time for developers to get into this habit if they are not already following it. But with IDEs like Visual Studio helping developers with auto indentation and spacing, it is not very difficult now. So, no excuses! ;-)

Code structure

Standardizing the structure of code is important to allow developers to locate objects quickly, trace bugs more effectively and, not least, make the code readable and self-explanatory. Let us talk about the structure of code today.

I checked multiple sources from Microsoft and also scanned through the code gallery at MSDN to study how people organize their code. I did not find a uniform order in the MSDN samples or in other code from Microsoft’s own developers. Some code in the gallery even declares member variables at the end of the class, while some uses an underscore for member variables, which we have strictly avoided. At some point I thought that we had no rigid reference to stick to, but that we would create our standards as close as possible. In fact, I had already assumed that as long as one order was followed throughout the application, it hardly mattered which order was put into practice. But I was curious to find an order that had a rationale behind it. For example, having events and public methods above private methods makes sense because, most of the time, it speeds up maintenance. There is no scientific evidence to support this, but if you maintain source code for a while, you should feel the same.

First and foremost, a "one file, one class" policy is a must. There should not be more than one class in one file. Multiple classes not only tend to make the file big but also complicate tracing and maintenance. Hence, there should always be one class in a file, and the file should be named after the class. It is suggested that even for objects like enumerators with namespace level scope, a separate class be used.

In a source file (C#), the following order should be followed for its contents:

// Copyright information
// File details
// Change log (version history)

Using statements (imports) 

Namespace
{
     Class
     {
          #region Enumerators (of class level scope) 

          #region Static Methods 

          #region Constants 

          #region Member Variables 

          #region Properties 

          #region Constructors and Destructors 

          #region Events 

          #region Public Methods 

          #region Protected Methods 

          #region Private Methods
     }
}

Whenever there is a question about ordering based on access specifiers, the order should always be: public, protected and private. Hence, if events need to be classified by access specifier, public events should come first, followed by protected and then private ones.
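
To make the order above concrete, here is a small sketch of a class laid out that way. The FileLogger members and the ".log" extension are hypothetical; only the order of the regions comes from the list above.

```csharp
// Copyright information, file details and change log would go here.

using System;

namespace MyCompany.Service.Data
{
    public class FileLogger
    {
        #region Enumerators
        public enum LoggingLevel
        {
            High,
            Medium,
            Low
        }
        #endregion

        #region Static Methods
        public static FileLogger Create(string name)
        {
            return new FileLogger(name + DEFAULT_EXTENSION);
        }
        #endregion

        #region Constants
        private const string DEFAULT_EXTENSION = ".log";
        #endregion

        #region Member Variables
        private string fileName = String.Empty;
        private string lastEntry = String.Empty;
        #endregion

        #region Properties
        public string FileName
        {
            get
            {
                return this.fileName;
            }
        }

        public string LastEntry
        {
            get
            {
                return this.lastEntry;
            }
        }
        #endregion

        #region Constructors and Destructors
        public FileLogger(string fileName)
        {
            this.fileName = fileName;
        }
        #endregion

        #region Events
        public event EventHandler Logged;
        #endregion

        #region Public Methods
        public void Log(LoggingLevel level, string message)
        {
            this.lastEntry = this.FormatEntry(level, message);
            this.OnLogged(EventArgs.Empty);
        }
        #endregion

        #region Protected Methods
        protected virtual void OnLogged(EventArgs e)
        {
            EventHandler handler = this.Logged;

            if (handler != null)
            {
                handler(this, e);
            }
        }
        #endregion

        #region Private Methods
        private string FormatEntry(LoggingLevel level, string message)
        {
            return "[" + level.ToString() + "] " + message;
        }
        #endregion
    }
}
```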

I am already drowsy now; it’s past midnight. I will limit today’s discussion on code structure to the order of code. Tomorrow, we will discuss sizing code and more. Goodnight!

Coding convention – Database and files

In the last blog, we stopped at the naming convention for C# code. However, it is not only C# source that forms an application. We need to standardize naming convention for database objects, files and folders etc. We will focus on them in this blog.

Naming convention for database

The table below is a chart with the standards to be followed for database objects. As in the previous blog for source code, we have three columns here. The first column has the database unit, the second has the details about which case to use (and special conditions, if any) and the last column has an example to illustrate.

Object
      Convention
Database
  • Use Pascal case
  • No abbreviation
  • Format:
    {CompanyName}[_Scope][_Project]

    Example(s):
    MyCompany_Application_Main

Table
  • Use Pascal case
  • Use plural name (for junction tables, concatenate name in one to many relationship order)
  • No abbreviation or underscore
  • No prefix or suffix (avoid Hungarian notation)

    Example(s):
    Users
    UserRoles

Column
  • Use Pascal case
  • No abbreviation or underscore
  • Primary key field should be named as singular form of table name + "ID" (e.g. UserID for table Users)
  • Boolean fields should have prefix as "Is", "Has" etc
  • DateTime fields should have "Date" or "Time" or both as suffix

    Example(s):
    UserID
    FirstName
    IsActive
    JoiningDate

Index
  • Use Pascal case
  • Format:
    {TableName}_{ColumnName}_{U/N}{C/N}
    (U/N for unique or not and C/N for clustered or not)
  • No prefix or suffix (avoid Hungarian notation)

    Example(s):
    TaskDetails_IsCurrent_NN

Constraint
  • Use Pascal case
  • Format:
    {Type}{TableName}_{ColumnName}
    (Type: Pk for Primary key, Fk for Foreign key, Ck for Check and Un for Unique)

    Example(s):
    PkUsers_UserID

View
  • Use Camel case
  • No abbreviation or underscore
  • Prefix "vw"

    Example(s):
    vwUserRoles

Stored Procedure
  • Use Camel case
  • No abbreviation or underscore
  • Prefix "sp"

    Example(s):
    spDeactivateOldEmployees

Trigger
  • Use Pascal case
  • Format:
    {TableName}[_Description]_{Type}
    (Type: U for Update, D for Delete and I for Insert)

    Example(s):
    Users_U
    Salary_CalculateTotal_U

Variable
  • Use Pascal case
  • No abbreviation or underscore
  • For boolean type, use prefix as "Is", "Has" etc.

    Example(s):
    @MaximumAge


Let us look at naming files and folders now.

Naming convention for files and folders

Files and folders should be named uniformly throughout the application. This is very important because a name can identify the type, purpose, scope and validity of a file in an application. The table below presents the standards.

Type
      Convention
Folder
  • Use Pascal case
  • No abbreviation (except for system standards)
  • Use underscore sparingly (avoid as much as possible)
  • No prefix or suffix (avoid Hungarian notation)

    Example(s):
    Content
    App_Code

Code (Source) File
  • Use Pascal case
  • Use the same name as the main class
  • No abbreviation or underscore
  • No prefix or suffix (avoid Hungarian notation)

    Example(s):
    Users.aspx.cs
    CalculateInvoice.cs

Backup File
  • Format:
    {Date}_{Initials}_{FileName}
    (Date in format yyyyMMdd and Initials of the file creator)

    Example(s):
    20090905_NY_CalculateInvoice.cs

Log File
  • Format:
    {Date[Time]}{Description}.log.{Extension}
    (Date[Time] can be in format: yyyy[MM[dd[HH[mm[ss]]]]])

    Example(s):
    200905MonthlyImport.log.xls
    20090905142530System.log.txt

DLL Assembly
  • Use Pascal case
  • Use the same name as the containing namespace

    Example(s):
    SecurityTools.dll


We will also discuss naming convention for style tags and classes soon. However, a quick mention about HTML code is relevant here. In HTML, use XML standards to ensure compatibility with newer versions of HTML and XHTML. Primarily, use Pascal case for all elements and Camel case for their attributes, unless required otherwise by the environment.

Saturday, September 5, 2009

Coding convention – C#: Naming convention

Naming convention is the most important part of a standard coding convention. By following a common standard to name identifiers, developers establish a common medium of communication. Identifiers are easily recognizable along with their context if a developer knows the standard. Naming convention deals with naming almost every programming object, from basic variables to large structures such as classes and namespaces.

Before listing the standards, it is necessary to understand three basic terms used in this context:

Pascal case

The first letter of each word in the name is in uppercase and the rest of the word in lowercase. If a name consists of more than one word, the first letter of each distinct word should be in uppercase.

For example: BlackColorCode
This name consists of three words ‘black’, ‘color’ and ‘code’, hence their first letters ‘B’, ‘C’ and ‘C’ respectively are in uppercase.

Camel case

All letters of the first word in the name are in lowercase. If a name consists of more than one word, the first letter of each distinct word (except the first word) should be in uppercase.

For example: blackColorCode
This name consists of three words ‘black’, ‘color’ and ‘code’. All letters of the first word ‘black’ are in lowercase. However, the following words (in this case ‘color’ and ‘code’) have their first letters, both ‘C’, in uppercase.

Hungarian notation

Hungarian notation is a notation used to name objects by prefixing their type to the name.

For example: strBlackColorCode
It identifies the variable as a string type, where ‘str’ is an abbreviation used for string type. With typed languages like C#, Microsoft has chosen to get rid of Hungarian notation and so shall we. We will use Hungarian notation sparingly (and try not to use it at all).

Naming convention

In the table below, let us have a look at how to name various programming units in C#. The first column has the programming unit, the second has the details about which case to use (and special conditions, if any) and the last column has an example to illustrate.

Identifier
      Convention
Namespace
  • Use Pascal case
  • No abbreviation or underscore
  • Format:
    {CompanyName}.{Technology}[.Format][.Design]

    Example(s):
    MyCompany.Service.Data

Class
  • Use Pascal case
  • No abbreviation or underscore
  • No prefix or suffix (avoid Hungarian notation; however, an Exception class should have the suffix “Exception”)

    Example(s):
    FileLogger

Interface
  • Use Pascal case
  • No abbreviation or underscore
  • Name should be the same as the default implementing class
  • Prefix character ‘I’ to indicate an interface

    Example(s):
    IFileLogger

Enumerator
(Type and Values)
  • Use Pascal case for both type and values
  • Use singular name for types (exception: use plural for type representing bitfields)
  • No abbreviation or underscore
  • No prefix or suffix (avoid Hungarian notation)

    Example(s):
    LoggingLevel
    High, Medium, Low

Property
  • Use Pascal case
  • No abbreviation or underscore
  • No prefix or suffix (avoid Hungarian notation)
  • Name of a property and underlying type should be same (only differing in case)

    Example(s):
    ListKey

Variable
  • Use Camel case
  • No abbreviation or underscore
  • No prefix or suffix (avoid Hungarian notation)

    Example(s):
    firstName

Constant
  • All letters in uppercase
  • No abbreviation
  • Use underscore to separate words (or to identify groups)
  • No prefix or suffix (avoid Hungarian notation)

    Example(s):
    MATURITY_AGE

Method
  • Use Pascal case
  • No abbreviation or underscore
  • No prefix or suffix (avoid Hungarian notation)

    Example(s):
    LoadProjects

Parameter
  • Use Camel case
  • No abbreviation or underscore
  • No prefix or suffix (avoid Hungarian notation)

    Example(s):
    fileName

Event
  • Use Pascal case
  • No abbreviation
  • Suffix “EventHandler” for event handlers and “EventArgs” for event arguments
  • Method must have two arguments: sender as the object that raised the event and
    e as the appropriate event class
  • Name events with a verb (-ing form for pre-events and –ed form for post-events)

    Example(s):
    SaveButton_Clicked
    MyPanel_Painting

Attribute
  • Use Pascal case
  • No abbreviation or underscore
  • Suffix the word “Attribute”

    Example(s):
    ObsoleteAttribute


Besides the standards mentioned in the chart above, the following should be taken care of:
  • Do not declare two identifiers whose names differ only in case.
  • Name an identifier according to its meaning, not its type.
  • Always add “EventHandler” to delegates related to events and “Callback” to delegates related to callback methods.
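
Put together, the chart above leads to code that looks roughly like this. All names here (IMember, Member, MemberRenamedEventHandler) and the value 18 are hypothetical and only show the casing rules.

```csharp
using System;

// Namespace: Pascal case, {CompanyName}.{Technology}[.Format][.Design].
namespace MyCompany.Service.Data
{
    // Interface: Pascal case with the prefix 'I'.
    public interface IMember
    {
        bool IsMature
        {
            get;
        }
    }

    // Delegate related to an event: suffix "EventHandler".
    public delegate void MemberRenamedEventHandler(object sender, EventArgs e);

    // Class: Pascal case, no prefix or suffix.
    public class Member : IMember
    {
        // Constant: uppercase, words separated by underscores.
        private const int MATURITY_AGE = 18;

        // Variables: Camel case, named by meaning, not by type.
        private string firstName = String.Empty;
        private int age = 0;

        // Parameters: Camel case.
        public Member(string firstName, int age)
        {
            this.firstName = firstName;
            this.age = age;
        }

        // Property: Pascal case.
        public bool IsMature
        {
            get
            {
                return this.age >= MATURITY_AGE;
            }
        }

        // Method: Pascal case, named after what it does.
        public string BuildGreeting(string salutation)
        {
            return salutation + ", " + this.firstName;
        }
    }
}
```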

Coding convention - C#

The importance of a standard coding convention is often ignored in many small and medium scale software companies. Most of the time, they make an effort to establish a convention, but with limited manpower and no clearly separated roles for employees, the initial efforts either fade midway or the convention stays confined to documents. When there are developers coding, there should always be a mechanism to monitor their code. Like every other discipline, coding needs constant monitoring and review too.

Upon analyzing the code written over a number of years in our own organization, I decided to streamline the process. With reference to a number of guidelines, including Microsoft’s own, I compiled a coding convention document, starting with C#. I have also developed a quick and easy tool for developers to verify their own code against the convention. As developers have now started using the tool and it is constantly recording data, I will soon be working on reports that make it possible to check how effective this implementation has been.

I thought it might be a good idea to share the details of my presentation and post the convention with possible explanations on this blog. It might be of some use to people looking to establish their own standards. Hence I will be posting the details with explanation over this and the next few blogs.

Why coding convention?

If you are a developer, have any of these ever happened to you?

Who wrote this code? What? It was me? ...But when?
Before I can fix anything, I have to contact the developer who wrote this code, to have this explained.
Does this code even work?
Where does this code start? And where does it end?

Well, based on what the software industry has to say, more than 80% of a software life cycle goes to maintenance. As Sun Microsystems puts it, hardly any software is maintained by its original author for its whole lifetime.

Many developers work on one project, even one module. Every developer has his/her own coding style. To bring uniformity and to get rid of the situations mentioned in italics above, it is important to set standards and ensure that they are followed. Following standards makes a huge positive impact on maintenance of projects.

The objective of setting coding standards is to have a positive impact on:
  • Avoidance of errors/bugs.
  • Maintainability, by implementing proven design principles and introducing uniformity of style.
  • Performance, by eradicating poor programming practices. 

We will start with naming conventions in the next blog.