Introduction
In my previous three articles on CodeProject.com, I have explained the fundamentals of Windows Communication Foundation (WCF), including:
If you have followed those three articles closely, you should be able to work with WCF now. Within the last two articles, I have explained how to utilize LINQ and the Entity Framework with WCF, so by now you should also be able to work with LINQ and EF. However, you may not fully understand LINQ and EF by just reading those two articles. So here I come, to explain all the fundamentals of LINQ and EF.
In addition to LINQ and EF, some people may still be using LINQ to SQL, which is the first ORM product from Microsoft, or a by-product of the C# team, or a simplified version of EF, or whatever you think and say it is. As LINQ to SQL (L2S) is so easy to work with, I will also write some articles to explain it.
That said, I plan to write the following five articles to explain LINQ, LINQ to SQL, and EF:
- Introducing LINQ-Language Integrated Query (this article)
- LINQ to SQL: Basic Concepts and Features (next article)
- LINQ to SQL: Advanced Concepts and Features (future article)
- LINQ to Entities: Basic Concepts and Features (future article)
- LINQ to Entities: Advanced Concepts and Features (future article)
After finishing these five articles, I will come back and write more articles on WCF drawn from my real work experience, which should be helpful in your real-world work if you are using WCF right now.
In this article, I will cover the following topics:
- What is LINQ
- New data type var
- Automatic properties
- Object initializer and Collection initializer
- Anonymous types
- Extension Methods
- Lambda expressions
- Built-in LINQ Extension Methods and method syntax
- LINQ query syntax and query expression
- Built-in LINQ operators
What is LINQ
Language-Integrated Query (LINQ) is a set of extensions to the .NET Framework that encompass language-integrated query, set, and transform operations. It extends C# and Visual Basic with native language syntax for queries and provides class libraries to take advantage of these capabilities.
Let us see an example first. Suppose there is a list of integers like this:
List<int> list = new List<int>() { 1, 2, 3, 4, 5, 6, 100 };
To find all the even numbers in this list, you might write code like this:
List<int> list1 = new List<int>();
foreach (var num in list)
{
    if (num % 2 == 0)
        list1.Add(num);
}
Now with LINQ, you can select all of the even numbers from this list and assign the query result to a variable, in just one statement, like this:
var list2 = from number in list
            where number % 2 == 0
            select number;
In this example, list2 yields the same numbers as list1, in the same order. As you can see, you don't write a foreach loop. Instead, you write a query expression that reads much like a SQL statement.
But what do from, where, and select mean here? Where are they defined? How and when can you use them? Let us start the exploration now.
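To make the comparison concrete, here is a minimal, self-contained sketch that runs both versions side by side and checks that they produce the same sequence. The class and method names (EvenNumbersDemo, WithLoop, WithQuery) are made up for this illustration and are not part of the test project created below.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EvenNumbersDemo
{
    // Loop-based version: builds a new list eagerly.
    public static List<int> WithLoop(List<int> source)
    {
        List<int> result = new List<int>();
        foreach (int num in source)
        {
            if (num % 2 == 0)
                result.Add(num);
        }
        return result;
    }

    // Query-expression version: returns a sequence that is
    // evaluated when it is enumerated.
    public static IEnumerable<int> WithQuery(List<int> source)
    {
        return from number in source
               where number % 2 == 0
               select number;
    }

    public static void Main()
    {
        List<int> list = new List<int>() { 1, 2, 3, 4, 5, 6, 100 };
        List<int> list1 = WithLoop(list);
        IEnumerable<int> list2 = WithQuery(list);

        // Both sequences contain 2, 4, 6, 100 in the same order.
        Console.WriteLine(list1.SequenceEqual(list2)); // True

        foreach (int n in list2)
            Console.WriteLine(n);
    }
}
```

Note that the query result is an IEnumerable<int> rather than a List<int>; the two hold the same numbers, but they are not the same type.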
Creating the test solution and project
To show these LINQ-related new features, we will need a test project to demonstrate what they are and how to use them. So we first need to create the test solution and the project.
Follow these steps to create the solution and the project:
- Start Visual Studio 2010.
- Select menu option File | New | Project... to create a new solution.
- In the New Project window, select Visual C# | Console Application as the Template.
- Enter TestLINQ as the Solution Name, and TestNewFeaturesApp as the (project) Name.
- Click OK to create the solution and the project.
New data type var
The first new feature that is very important for LINQ is var. Although it is often described as a new data type, var is really a keyword: it declares a local variable whose type the compiler infers from the initializer expression.
In the C# 3.0 specification, such variables are called implicitly typed local variables.
A var variable must be initialized when it is declared. The initializer expression must have a compile-time type, so it cannot be the null literal, but its value at run time can be null. Once the variable is declared, its type is fixed to the type of the initializer.
The following statements are valid uses of the var keyword:
var x = "1";
var n = 0;
string s = "string";
var s2 = s;
s2 = null;
string s3 = null;
var s4 = s3;
At compile time, the above var statements are treated exactly as if they had been declared with explicit types, like this:
string x = "1";
int n = 0;
string s2 = s;
string s4 = s3;
The var keyword is only meaningful to the C# compiler. The compiled assembly is actually a valid .NET 2.0 assembly. It doesn't need any special instructions or libraries to support this feature.
The following statements are invalid uses of the var keyword:
var v;
var nu = null;
var v2 = "12"; v2 = 3;
The first one is illegal because it doesn't have an initializer.
The second one initializes the variable nu to null, which is not allowed, although once declared, a var variable of a reference type can be assigned null. The compiler has to infer the variable's type from the initializer, and the null literal has no type to infer from.
The third one is illegal because v2 is of type string, and an integer can't be converted to a string implicitly.
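As a quick check of the rules above, the following sketch prints the types that the compiler infers for a few var declarations. The VarDemo class and its TypeNameOf helper are hypothetical names introduced only for this example.

```csharp
using System;

public static class VarDemo
{
    // Returns the run-time type name of a value. For the non-null values
    // used here, it matches the type the compiler inferred for var.
    public static string TypeNameOf(object value)
    {
        return value.GetType().Name;
    }

    public static void Main()
    {
        var x = "1";     // inferred as string
        var n = 0;       // inferred as int
        var d = 1.5;     // inferred as double
        var m = 100.0m;  // inferred as decimal

        Console.WriteLine(TypeNameOf(x)); // String
        Console.WriteLine(TypeNameOf(n)); // Int32
        Console.WriteLine(TypeNameOf(d)); // Double
        Console.WriteLine(TypeNameOf(m)); // Decimal

        // n = "two";  // would not compile: n is an int for its whole lifetime
    }
}
```

The commented-out assignment is the same kind of error as the third invalid statement above: var does not make a variable loosely typed.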
Automatic properties
In the past, if we wanted to expose a class member as a property, we had to define a private member variable first. For example, for the Product class, we can define a property ProductName as follows:
private string productName;
public string ProductName
{
    get { return productName; }
    set { productName = value; }
}
This may be useful if we need to add some logic inside the get/set methods. But if we don't need to, the above format gets tedious, especially if there are many members.
Now, with C# 3.0 and above, the above property can be simplified into one statement:
public string ProductName { get; set; }
When the C# compiler compiles this statement, it automatically creates a hidden private backing field (with a compiler-generated name, not productName) and generates the old-style get/set methods around it. This can save lots of typing.
Just as with var, automatic properties are only meaningful to the C# compiler. The compiled assembly is actually a valid .NET 2.0 assembly.
Interestingly, later on, if you find you need to add logic to the get/set methods, you can still convert this automatic property to an old-style property.
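The conversion mentioned above might look like the following sketch. PricedProduct is a hypothetical variant of the Product class, and the rule that a price cannot be negative is an assumption made purely to give the setter some logic.

```csharp
using System;

// Hypothetical variant of the article's Product class.
public class PricedProduct
{
    // Still an automatic property: no extra logic is needed here.
    public string ProductName { get; set; }

    // Converted to an explicit property because the setter now validates.
    private decimal unitPrice;
    public decimal UnitPrice
    {
        get { return unitPrice; }
        set
        {
            if (value < 0)
                throw new ArgumentOutOfRangeException("value", "UnitPrice cannot be negative.");
            unitPrice = value;
        }
    }
}

public static class AutoPropertyDemo
{
    public static void Main()
    {
        PricedProduct p = new PricedProduct();
        p.ProductName = "first candy";
        p.UnitPrice = 10.0m;
        Console.WriteLine("{0}: {1}", p.ProductName, p.UnitPrice);

        try
        {
            p.UnitPrice = -1m; // rejected by the setter
        }
        catch (ArgumentOutOfRangeException)
        {
            Console.WriteLine("negative price rejected");
        }
    }
}
```

Callers are unaffected by the change, because they use the same property syntax either way.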
Now, let us create this class in the test project:
public class Product
{
    public int ProductID { get; set; }
    public string ProductName { get; set; }
    public decimal UnitPrice { get; set; }
}
We can put this class inside the Program.cs file, within the namespace TestNewFeaturesApp. We will use this class throughout this article, to test C# features related to LINQ.
Object initializer
In the past, we couldn't initialize an object's members in the same expression that created it without a matching constructor. For example, we could create and initialize a Product object like this, if the Product class had a constructor with three parameters:
Product p = new Product(1, "first candy", 100.0m);
Or, we could create the object, and then initialize it later, like this:
Product p = new Product();
p.ProductID = 1;
p.ProductName = "first candy";
p.UnitPrice = (decimal)100.0;
Now with the new object initializer feature, we can do it as follows:
Product product = new Product
{
    ProductID = 1,
    ProductName = "first candy",
    UnitPrice = (decimal)100.0
};
At compile time, the compiler will automatically insert the necessary property setter code. So again, this new feature is a C# compiler feature. The compiled assembly is actually a valid .NET 2.0 assembly.
We can also define and initialize a variable with an array like this:
var arr = new[] { 1, 10, 20, 30 };
This array is called an implicitly typed array.
Collection initializer
Similar to the object initializer, we can also initialize a collection when we declare it, like this:
List<Product> products = new List<Product> {
    new Product {
        ProductID = 1,
        ProductName = "first candy",
        UnitPrice = (decimal)10.0 },
    new Product {
        ProductID = 2,
        ProductName = "second candy",
        UnitPrice = (decimal)35.0 },
    new Product {
        ProductID = 3,
        ProductName = "first vegetable",
        UnitPrice = (decimal)6.0 },
    new Product {
        ProductID = 4,
        ProductName = "second vegetable",
        UnitPrice = (decimal)15.0 },
    new Product {
        ProductID = 5,
        ProductName = "another product",
        UnitPrice = (decimal)55.0 }
};
Here, we created a list and initialized it with five new products. For each new product, we used the object initializer to initialize its values.
Just as with the object initializer, this new feature, the collection initializer, is also a C# compiler feature, and the compiled assembly is a valid .NET 2.0 assembly.
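To show what "compiler feature" means here, the sketch below builds the same list twice: once with object and collection initializers, and once with the explicit property assignments and Add calls that the initializers roughly stand for. Item is a hypothetical, cut-down stand-in for the Product class, and InitializerDemo and its methods are names invented for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the article's Product class.
public class Item
{
    public int ProductID { get; set; }
    public string ProductName { get; set; }
}

public static class InitializerDemo
{
    // Uses a collection initializer with nested object initializers.
    public static List<Item> WithInitializer()
    {
        return new List<Item>
        {
            new Item { ProductID = 1, ProductName = "first candy" },
            new Item { ProductID = 2, ProductName = "second candy" }
        };
    }

    // Roughly what the initializer syntax above is shorthand for.
    public static List<Item> WithExplicitCalls()
    {
        List<Item> list = new List<Item>();

        Item first = new Item();
        first.ProductID = 1;
        first.ProductName = "first candy";
        list.Add(first);

        Item second = new Item();
        second.ProductID = 2;
        second.ProductName = "second candy";
        list.Add(second);

        return list;
    }

    public static void Main()
    {
        List<Item> a = WithInitializer();
        List<Item> b = WithExplicitCalls();
        Console.WriteLine(a.Count == b.Count);                     // True
        Console.WriteLine(a[1].ProductName == b[1].ProductName);   // True
    }
}
```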
Anonymous types
With the new object initializer feature and the new var keyword, we can create anonymous data types easily in C# 3.0.
For example, if we define a variable like this:
var a = new { Name = "name1", Address = "address1" };
At compile time, the compiler will actually create an anonymous type roughly as follows:
class __Anonymous1
{
    private readonly string name;
    private readonly string address;
    public string Name
    {
        get { return name; }
    }
    public string Address
    {
        get { return address; }
    }
}
The properties of a C# anonymous type are read-only, so the generated class has getters but no setters. It also gets a constructor that fills the fields, and overrides of Equals, GetHashCode, and ToString, which are omitted above. The name of the anonymous type is automatically generated by the compiler, and cannot be referenced in the program text.
If two anonymous object initializers in the same assembly specify the same property names with the same types in the same order, then they produce instances of the same anonymous type. For example, if there is another variable defined like this:
var b = new { Name = "name2", Address = "address2" };
Then we can assign a to b like this:
b = a;
The anonymous type is particularly useful for LINQ, where the result of a query can be shaped to be whatever you like. We will give more examples of this when we discuss LINQ.
As mentioned earlier, this new feature is again a C# compiler feature, and the compiled assembly is a valid .NET 2.0 assembly.
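The type-identity rule for anonymous types can be checked with a short sketch like this one. AnonymousTypeDemo and SameType are hypothetical names used only here.

```csharp
using System;

public static class AnonymousTypeDemo
{
    // True when both objects are instances of exactly the same type.
    public static bool SameType(object x, object y)
    {
        return x.GetType() == y.GetType();
    }

    public static void Main()
    {
        var a = new { Name = "name1", Address = "address1" };
        var b = new { Name = "name2", Address = "address2" };
        var c = new { Address = "address1", Name = "name1" }; // same properties, different order

        Console.WriteLine(SameType(a, b)); // True: same names, types, and order
        Console.WriteLine(SameType(a, c)); // False: the property order differs

        b = a; // allowed because a and b have the same anonymous type

        // The generated Equals compares property values, not references.
        var a2 = new { Name = "name1", Address = "address1" };
        Console.WriteLine(a.Equals(a2));   // True

        // a.Name = "other";  // would not compile: the properties are read-only
    }
}
```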
Extension Methods
Extension Methods are static methods that can be invoked using the instance method syntax. In effect, Extension Methods make it possible for us to extend existing types and constructed types with additional methods.
For example, we can define an Extension Method as follows:
public static class MyExtensions
{
    public static bool IsCandy(this Product p)
    {
        if (p.ProductName.IndexOf("candy") >= 0)
            return true;
        else
            return false;
    }
}
In this example, the static method IsCandy takes a this parameter of type Product, and searches for the word candy inside the product name. If it finds a match, it assumes this is a candy product and returns true. Otherwise, it returns false, meaning this is not a candy product.
Since all Extension Methods must be defined in top-level static classes, to simplify the example, we put this class inside the same namespace as our main test application, TestNewFeaturesApp, and place it on the same level as the Program class so that it is a top-level class. Now, in the program, we can call this Extension Method like this:
if (product.IsCandy())
    Console.WriteLine("yes, it is a candy");
else
    Console.WriteLine("no, it is not a candy");
It looks as if IsCandy is a real instance method of the Product class. Actually, it is not a member of the Product class at all. It is defined in another static class, to extend the functionality of the Product class. This is why it is called an Extension Method.
Not only does it look like a real instance method, but this new Extension Method actually pops up in IntelliSense when a dot is typed following the product variable. The following image shows the IntelliSense of the product variable within Visual Studio.
[Image: IntelliSense member list for the product variable, including the IsCandy Extension Method]
Under the hood, when the C# compiler compiles a method call on an instance, it first checks whether the class has a matching instance method. Only if there is none does it look for Extension Methods: it searches the static classes in the current namespace, and in the namespaces imported with using directives, for a static method with that name whose first (this) parameter is of the instance type, or of a type the instance can be implicitly converted to, such as a base type. If it finds a match, the compiler will call that Extension Method. This means that instance methods take precedence over Extension Methods, and Extension Methods that are imported in inner namespace declarations take precedence over Extension Methods that are imported in outer namespaces.
In our example, when product.IsCandy() is being compiled, the compiler first checks the Product class and doesn't find a method named IsCandy. It then searches the static class MyExtensions, and finds an Extension Method with the name IsCandy and with a first parameter of type Product.
At compile time, the compiler actually changes product.IsCandy() to this call:
MyExtensions.IsCandy(product)
Surprisingly, Extension Methods can be defined for sealed classes. In our example, you can change the Product class to be sealed and it still runs without any problem. This gives us great flexibility to extend system types, because many of the system types are sealed. On the other hand, Extension Methods are less discoverable and are harder to maintain, so they should be used with great caution. If your requirements can be achieved with an instance method, you should not define an Extension Method to do the same work.
Not surprisingly, this new feature is again a C# compiler feature. The compiled code is an ordinary static method call that runs on the .NET 2.0 runtime, although the compiler marks each Extension Method with ExtensionAttribute, which is defined in System.Core in .NET 3.5.
Extension Methods are the basis of LINQ. We will discuss the various Extension Methods defined by .NET 3.5 in the namespace System.Linq, later.
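The two points above, extending a sealed system type and instance methods taking precedence, can be seen in a small sketch. StringExtensions, ContainsWord, and ExtensionDemo are hypothetical names; the extension ToUpper exists only to show that it is never picked by an instance-style call.

```csharp
using System;

public static class StringExtensions
{
    // Hypothetical Extension Method on the sealed System.String type.
    public static bool ContainsWord(this string text, string word)
    {
        if (text == null || word == null)
            return false;
        return text.IndexOf(word, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    // Same name and parameters as string's own instance method ToUpper().
    // Instance-style calls never reach it, because instance methods win.
    public static string ToUpper(this string text)
    {
        return "extension";
    }
}

public static class ExtensionDemo
{
    public static void Main()
    {
        string name = "first candy";

        // Instance-style call and the static call it is compiled into.
        Console.WriteLine(name.ContainsWord("candy"));                   // True
        Console.WriteLine(StringExtensions.ContainsWord(name, "candy")); // True

        // The instance method takes precedence over the Extension Method.
        Console.WriteLine("abc".ToUpper());                  // ABC
        Console.WriteLine(StringExtensions.ToUpper("abc"));  // extension
    }
}
```

The shadowed ToUpper also illustrates the maintenance risk mentioned above: an Extension Method can silently stop being called if the type later gains an instance method with the same signature.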
Now, the Program.cs file should look like this:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace TestNewFeaturesApp
{
    class Program
    {
        static void Main(string[] args)
        {
            var x = "1";
            var n = 0;
            string s = "string";
            var s2 = s;
            s2 = null;
            string s3 = null;
            var s4 = s3;
            Product product = new Product
            {
                ProductID = 1,
                ProductName = "first candy",
                UnitPrice = (decimal)100.0
            };
            var arr = new[] { 1, 10, 20, 30 };
            List<Product> products = new List<Product> {
                new Product {
                    ProductID = 1,
                    ProductName = "first candy",
                    UnitPrice = (decimal)10.0 },
                new Product {
                    ProductID = 2,
                    ProductName = "second candy",
                    UnitPrice = (decimal)35.0 },
                new Product {
                    ProductID = 3,
                    ProductName = "first vegetable",
                    UnitPrice = (decimal)6.0 },
                new Product {
                    ProductID = 4,
                    ProductName = "second vegetable",
                    UnitPrice = (decimal)15.0 },
                new Product {
                    ProductID = 5,
                    ProductName = "third product",
                    UnitPrice = (decimal)55.0 }
            };
            var a = new { Name = "name1", Address = "address1" };
            var b = new { Name = "name2", Address = "address2" };
            b = a;
            if (product.IsCandy())
                Console.WriteLine("yes, it is a candy");
            else
                Console.WriteLine("no, it is not a candy");
        }
    }
    public sealed class Product
    {
        public int ProductID { get; set; }
        public string ProductName { get; set; }
        public decimal UnitPrice { get; set; }
    }
    public static class MyExtensions
    {
        public static bool IsCandy(this Product p)
        {
            if (p.ProductName.IndexOf("candy") >= 0)
                return true;
            else
                return false;
        }
    }
}
So far in Program.cs, we have:
- Defined several variables with var
- Defined a sealed class Product
- Created a product with the name of "first candy"
- Created a product list containing five products
- Defined a static class, and added a static method IsCandy with a this parameter of type Product, to make this method an Extension Method
- Called the Extension Method on the candy product, and printed out a message according to its name
If you run the program, the output will look like this:
Lambda expressions
Building on the Extension Method feature that is new in C# 3.0 and the anonymous method (or inline method) feature introduced in C# 2.0, C# 3.0 also introduces a new kind of expression called the lambda expression.
A lambda expression is essentially a more concise syntax for writing anonymous methods. Next, let's see what a lambda expression is, step by step.
First, with C# 3.0 and .NET 3.5, there is a new generic delegate type, Func<A, R>, which represents a function that takes an argument of type A and returns a value of type R:
delegate R Func<A, R>(A arg);
In fact, there are several overloaded versions of Func, of which Func<A, R> is one.
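Before using Func in an extension, it may help to see the same Func<int, bool> delegate created in three ways: from a named method, from a C# 2.0 anonymous method, and from a C# 3.0 lambda expression. This is only an illustrative sketch; FuncDemo and IsEven are hypothetical names.

```csharp
using System;

public static class FuncDemo
{
    // A named method whose signature matches Func<int, bool>.
    public static bool IsEven(int n)
    {
        return n % 2 == 0;
    }

    public static void Main()
    {
        // 1. Delegate created from a named method.
        Func<int, bool> named = IsEven;

        // 2. C# 2.0 anonymous method.
        Func<int, bool> anonymous = delegate(int n) { return n % 2 == 0; };

        // 3. C# 3.0 lambda expression: same logic, shorter syntax.
        Func<int, bool> lambda = n => n % 2 == 0;

        // Another Func overload: two arguments and a return value.
        Func<int, int, int> add = (x, y) => x + y;

        Console.WriteLine(named(4));      // True
        Console.WriteLine(anonymous(4));  // True
        Console.WriteLine(lambda(5));     // False
        Console.WriteLine(add(2, 3));     // 5
    }
}
```

All three Func<int, bool> variables hold delegates of the same type and can be passed wherever that delegate type is expected.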
Now, we will use this new generic delegate type to define an extension:
Collapse | Copy Codepublic static IEnumerable Get(this IEnumerable source, Funcbool