Wednesday, September 19, 2012
This property is not available on SQL Server 7.0
You're likely getting this error because you're trying to interface with SQL Server 2012 through the version 10.0 (2008) of the SMO library -- SQL is backwards compatible, not forwards compatible.
Ensure that you are compiling against the version (11.0) 2012 SMO libraries (can be found in the installation SDK directory), and that if you include any SMO libraries with your installation that they are also the version 11.0 binaries.
Tuesday, September 4, 2012
Bulk Inserts with the Entity Framework
A bulk insert is a SQL command that allows you to insert a large number (100s or 1000s) of records for a single table all at once (similar to an Oracle prepared statement). Without a bulk insert, the records would have to be inserted one at a time and incur the expense of many SQL commands as opposed to just one bulk insert command.
As of version 5.0 of the Entity Framework (EF), there is still no support for bulk inserts. If records are to be inserted in bulk, it will have to be done outside the EF. There are many examples online on how to do this with an EF context and the SqlBulkCopy object. However, they assume that the entire SQL transaction will consist of nothing but the records in the bulk insert. In reality, transactions are almost always going to involve inserting records into more than just one table. Also, you'll likely want to use bulk insert when it improves performance, but continue to use EF for adding/updating other objects because it is so much more convenient. This article shows how to accomplish this using the EF.
NOTE: consider disabling lazy loading in conjunction with the below technique. Lazy-loading (combined with the EF "fixup" feature) has serious performance implications, especially when large numbers or objects are involved.
For example, imagine we are creating a business function that inserts 1 object of type Customer and 10000 of type Order. To get better performance, we'll use a bulk insert for the Order objects:
This will "work", but it definitely will not be transactional. Heck, it won't even all occur using the same SQL connection! This is because SqlBulkCopy is given a connection string, not a connection - so it won't share a connection with the context. Second, SqlBulkCopy will begin and end its own transaction, as will the call to SaveChanges(). This means that if there is a failure the Customer might be inserted but not the Orders, or vice versa.
To make this transactional we must (unfortunately) apply several reflection and casting hacks in the process of performing the following steps:
This technique is limited in that objects you are bulk inserting are not tracked by the EF. This means that the EF is unaware of any object that is bulk inserted which means there is the potential for conflicts if the same object is both bulk inserted and inserted via AddObject(). Also, SqlBulkCopy will not use any of the EF's navigational properties because they have no meaning to it. You must instead ensure all required scalar properties are initialized.
If the limitations are not an issue for you or easy to avoid (which in general they should be), this technique is useful for improving the efficiency of any business function that requires the insertion of large numbers of records.
As of version 5.0 of the Entity Framework (EF), there is still no support for bulk inserts. If records are to be inserted in bulk, it will have to be done outside the EF. There are many examples online on how to do this with an EF context and the SqlBulkCopy object. However, they assume that the entire SQL transaction will consist of nothing but the records in the bulk insert. In reality, transactions are almost always going to involve inserting records into more than just one table. Also, you'll likely want to use bulk insert when it improves performance, but continue to use EF for adding/updating other objects because it is so much more convenient. This article shows how to accomplish this using the EF.
NOTE: consider disabling lazy loading in conjunction with the below technique. Lazy-loading (combined with the EF "fixup" feature) has serious performance implications, especially when large numbers or objects are involved.
For example, imagine we are creating a business function that inserts 1 object of type Customer and 10000 of type Order. To get better performance, we'll use a bulk insert for the Order objects:
public void CreateCustomer(int id) { using(var dc = GetContext()) { var customer = new Customer { Id = id }; dc.Customers.AddObject(customer); IList<order> orders = new List<order>(); for(int orderId = 1; orderId <= 10000; orderId++) { // for example simplification, I did not associate order with customer var order = new Order { Id = orderId }; orders.Add(order); } var copy = new SqlBulkCopy(dc.Database.Connection.ConnectionString, "Orders"); // AsDataReader() taken from http://code.msdn.microsoft.com/LinqEntityDataReader copy.WriteToServer(order.AsDataReader()); dc.SaveChanges(); } }
This will "work", but it definitely will not be transactional. Heck, it won't even all occur using the same SQL connection! This is because SqlBulkCopy is given a connection string, not a connection - so it won't share a connection with the context. Second, SqlBulkCopy will begin and end its own transaction, as will the call to SaveChanges(). This means that if there is a failure the Customer might be inserted but not the Orders, or vice versa.
To make this transactional we must (unfortunately) apply several reflection and casting hacks in the process of performing the following steps:
- Forcibly open a database connection
- Manually begin and end an entity transaction
- Obtain the current SQL transaction from the entity transaction
- Instantiate a SqlBulkCopy object using the connection and transaction
public void CreateCustomer(int id) { using(ObjectContext dc = GetContext()) { // force open the connection // if we don't explicitly open it first, the EF will open and close it at will // making it impossible to reliably obtain one when we need it dc.Connection.Open(); try { // create a transaction now so that either the entire operation // succeeds or nothing succeeds. Also, we'll need this transaction // later when instantiating the SqlBulkCopy object using (var tx = entityConnection.BeginTransaction()) { var customer = new Customer { Id = id }; dc.Customers.AddObject(customer); IList<order> orders = new List<order>(); for(int orderId = 1; orderId <= 10000; orderId++) { // for example simplification, I did not associate order with customer var order = new Order { Id = orderId }; orders.Add(order); } BulkInsert(dc, orders.AsDataReader()); dc.SaveChanges(); } } finally { Context.Connection.Close(); } } } public void BulkInsert(ObjectContext context, IDataReader reader) { var sqlConnection = context.Connection.StoreConnection as SqlConnection; if (sqlConnection != null) { var parserProp = sqlConnection.GetType().GetProperty("Parser", BindingFlags.NonPublic | BindingFlags.Instance); if (parserProp != null) { var parser = parserProp.GetValue(sqlConnection, null); var sqltxProp = parser.GetType().GetProperty("CurrentTransaction", BindingFlags.NonPublic | BindingFlags.Instance); var currentTransaction = sqltxProp.GetValue(parser, null); var sqlTransaction = currentTransaction as SqlTransaction; if (sqlTransaction == null) { // this means that the current transaction is likely an SqlInternalTransaction and // that we must obtain the true, underlying transaction from the Parent property var parentProp = currentTransaction.GetType().GetProperty("Parent", BindingFlags.NonPublic | BindingFlags.Instance); currentTransaction = parentProp.GetValue(currentTransaction, null); sqlTransaction = currentTransaction as SqlTransaction; } if (sqlTransaction != null) { var sqlBulkCopy = new SqlBulkCopy(sqlConnection, SqlBulkCopyOptions.Default, sqlTransaction); sqlBulkCopy.DestinationTableName = typeof(T).Name; sqlBulkCopy.WriteToServer(entities.AsDataReader()); } } } }
This technique is limited in that objects you are bulk inserting are not tracked by the EF. This means that the EF is unaware of any object that is bulk inserted which means there is the potential for conflicts if the same object is both bulk inserted and inserted via AddObject(). Also, SqlBulkCopy will not use any of the EF's navigational properties because they have no meaning to it. You must instead ensure all required scalar properties are initialized.
If the limitations are not an issue for you or easy to avoid (which in general they should be), this technique is useful for improving the efficiency of any business function that requires the insertion of large numbers of records.
Sunday, March 18, 2012
Never Explicitly Call DateTime.Now
In order to evoke certain scenarios from your application testing, it is often necessary to make your system think the current time is some point in the past or future. Traditionally, this is done by modifying the Windows system date/time. But a change like this affects ALL applications, not just your own -- with potentially nasty results.
At the heart of this issue is the fact that accessing the property DateTime.Now is typically scattered everywhere across an application's code base. Fetching the value in this manner - explicitly - tightly couples your code to the system clock.
A better way would be to treat the system clock as a separate concern. Instead of accessing DateTime.Now directly, create an interface, called IClock for example, that indirectly retrieves the current time. By using IClock, objects would be loosely coupled to the system clock. In fact, a sophisticated implementation if IClock would allow for the current time to be offset by a certain amount, allowing an application to be tested "in the past" or "in the future", all without having to adjust the actual system time.
At the heart of this issue is the fact that accessing the property DateTime.Now is typically scattered everywhere across an application's code base. Fetching the value in this manner - explicitly - tightly couples your code to the system clock.
A better way would be to treat the system clock as a separate concern. Instead of accessing DateTime.Now directly, create an interface, called IClock for example, that indirectly retrieves the current time. By using IClock, objects would be loosely coupled to the system clock. In fact, a sophisticated implementation if IClock would allow for the current time to be offset by a certain amount, allowing an application to be tested "in the past" or "in the future", all without having to adjust the actual system time.
Tuesday, March 13, 2012
Avoid Custom .NET Exception Boilerplate Code
In order to guarantee a custom .NET exception will work in all scenarios, Microsoft recommends the exception class observe certain conventions. Specifically, the class should implement certain constructors and serialization methods. This means each of an application's custom exceptions will likely end up with the same boilerplate code. If an application has dozens of unique exceptions this can add up to a lot of code duplication, with each exception having to implement the same constructors and serialization code.
This code duplication can be avoided by using a single, generic exception used application-wide. This solution involves the creation of a single DomainException<T> class with one or more serialization exception details classes:
This code duplication can be avoided by using a single, generic exception used application-wide. This solution involves the creation of a single DomainException<T> class with one or more serialization exception details classes:
[Serializable] public class DomainException<T> : Exception { public DomainException(T exceptionDetails) : base(default(String)) { Details = exceptionDetails; } public T Details { get; private set; } public DomainException() { // no argument constructor required for serialization } public DomainException(T exceptionDetails, Exception innerexception) : base(default(String), innerexception) { Details = exceptionDetails; } protected DomainException(SerializationInfo si, StreamingContext sc) : base(si, sc) { if (si == null) throw new NullReferenceException("si"); Details = (T)si.GetValue("exceptionDetails", typeof(T)); } }This exception class acts as a container for another simple property-only class that includes the exception details:
[Serializable] public class MyExceptionDetails { public MyExceptionDetails(int failureCode) { FailureCode = failureCode; } public int FailureCode { get; private set; } }A specific error details class like this one would be created where, in the past, an entire exception would have been created. Only the [Serialization] attribute is required to make it serialize-able/compatible with DomainException<T>. Throwing this exception would look like:
throw new DomainException<MyExceptionDetails>(new MyExceptionDetails(1001));And because the exception is a concrete type it can be explicitly caught:
try { // some code here } catch(DomainException<MyExceptionDetails> ex) { // examine ex.Details }
Friday, March 9, 2012
Alternative to Visual Studio Solution Folders
Visual Studio provides solution folders as a means to organize items in a hierarchical fashion, especially items that are not part of any one project. This works when existing items are added to solution folders, but not new items. This is because solution folders are virtual - the hierarchy of folders as they appear within the solution is not mirrored on disk. In fact, no folders are created on the physical disk and an item that is added to a solution folder is simply placed in the root solution directory.
To work around the solution folder deficiency I simply created a new project to hold ancillary items. Specifically:
To work around the solution folder deficiency I simply created a new project to hold ancillary items. Specifically:
- Create a new C# class library project called "Accessories".
- Remove all classes from the new project.
- From the solution configuration manager, un-check the project's build flag for all configurations.
- Add folders and items to the project, setting each item's build action property to "None".
Friday, February 24, 2012
C# Socket Hello World Example
Often a better understanding of a programming concept can be had by looking at a bare bones example. Below is such an example of using .NET sockets to establish communication between a server and one or more clients that does without the fluff. The server code:
The server will establish connections as often as their are clients. An AutoResetEvent is used to signal when the server has established a connection with a client and should ready itself for another connection. The client:
class Server { static void Main(string[] args) { // create the listener socket IPAddress ipAddress = IPAddress.Parse("127.0.0.1"); Socket serverSocket = new Socket(ipAddress.AddressFamily, SocketType.Stream, ProtocolType.Tcp); IPEndPoint ipEndPoint = new IPEndPoint(ipAddress, 52000); serverSocket.Bind(ipEndPoint); serverSocket.Listen(100); AutoResetEvent connectedSignal = new AutoResetEvent(false); // establish connections with clients while (true) { serverSocket.BeginAccept(new AsyncCallback(AcceptCallback), new Tuple<Socket, AutoResetEvent>(serverSocket, connectedSignal)); // wait here until we've established a connection connectedSignal.WaitOne(); } } // asynchronous callback to accept a connection static void AcceptCallback(IAsyncResult acceptAsynchResult) { var connectionObject = (Tuple<Socket, AutoResetEvent>)acceptAsynchResult.AsyncState; // signal to the loop in Main that we've established a connection connectionObject.Item2.Set(); Socket receiveSocket = connectionObject.Item1.EndAccept(acceptAsynchResult); // receive the data var readData = new Tuple<Socket, byte[]>(receiveSocket, new byte[1024]); receiveSocket.BeginReceive(readData.Item2, 0, readData.Item2.Length, SocketFlags.None, new AsyncCallback(ReadCallback), readData); } // asynchronous callback to read socket data static void ReadCallback(IAsyncResult readAsyncResult) { var readData = (Tuple<Socket, byte[]>)readAsyncResult.AsyncState; // handle SocketException where SocketErrorCode == SocketError.ConnectionReset // to gracefully handle sudden client disconnects int read = readData.Item1.EndReceive(readAsyncResult); if (read > 0) { // write received data to the console Console.WriteLine(Encoding.UTF8.GetString(readData.Item2, 0, read)); // set up to receive more data readData.Item1.BeginReceive(readData.Item2, 0, readData.Item2.Length, SocketFlags.None, new AsyncCallback(ReadCallback), readData); } } }
The server will establish connections as often as their are clients. An AutoResetEvent is used to signal when the server has established a connection with a client and should ready itself for another connection. The client:
IPEndPoint ipe = new IPEndPoint(IPAddress.Parse("127.0.0.1"), 52000); clientSocket = new Socket(ipe.AddressFamily, SocketType.Stream, ProtocolType.Tcp); clientSocket.Connect(ipe); clientSocket.Send(Encoding.UTF8.GetBytes("Hello World")); clientSocket.Shutdown(SocketShutdown.Both); clientSocket.Disconnect(false);
Adding blogger syntax highlighting
Blogger doesn't support syntax highlighting for code by default - you have to add it. But it's a snap with the instructions here: http://heisencoder.net/2009/01/adding-syntax-highlighting-to-blogger.html. Plus, it's all inline. There is a newer version, but it requires you add a link to javascript at alexgorbatchev.com which I wanted to avoid if I could.
Wednesday, February 22, 2012
PowerShell's Read-Host returns immediately
If you're like me and to save time copy and paste a command line you use repeatedly into the PowerShell command window, you could cause issues for the Read-Host cmdlet. For example, say I pasted these commands:
cd c:\somedirectory
.\MyScript.ps1 someparameters
If I, when I copied that text, included the trailing carriage return/line feed at the end, that would cause those two commands to execute the instant I pasted them into the PowerShell window. The problem is that if MyScript.ps1 contains a call to Read-Host it will trick it into thinking the enter key was pressed, when it was not, and your first Read-Host will fail to block.
To get around this I either 1) make sure I manually enter the command or 2) make sure I don't include the trailing newline when I copy the command text.
cd c:\somedirectory
.\MyScript.ps1 someparameters
If I, when I copied that text, included the trailing carriage return/line feed at the end, that would cause those two commands to execute the instant I pasted them into the PowerShell window. The problem is that if MyScript.ps1 contains a call to Read-Host it will trick it into thinking the enter key was pressed, when it was not, and your first Read-Host will fail to block.
To get around this I either 1) make sure I manually enter the command or 2) make sure I don't include the trailing newline when I copy the command text.
System.NotSupportedException instantiating a .NET IpcChannel
Instantiating a System.Runtime.Remoting.Channels.Ipc.IpcChannel was throwing a System.NotSupportedException. In Visual Studio, hovering over the exception showed it to be a plain System.Exception containing a property _COMPlusExceptionCode with a value of -532462766. I've also seen a RemotingException, "Failed to create an IPC Port: Access is denied", so the error message you receive may be inconsistent as well.
The problem appears to be a lingering, identical IpcChannel within the same process, even one that is no longer registered but has not yet been completely cleaned up. The _COMPlusExceptionCode threw me off for a while, but it appears a solution has been found here:
http://social.msdn.microsoft.com/Forums/en/netfxremoting/thread/d154e4a9-3e31-41a5-944c-db867ca77e9e
However this solution will permit multiple connections to be open with potentially undesirable consequences. A loop that makes several attempts to create the IpcChannel may be a more appropriate solution.
The problem appears to be a lingering, identical IpcChannel within the same process, even one that is no longer registered but has not yet been completely cleaned up. The _COMPlusExceptionCode threw me off for a while, but it appears a solution has been found here:
http://social.msdn.microsoft.com/Forums/en/netfxremoting/thread/d154e4a9-3e31-41a5-944c-db867ca77e9e
However this solution will permit multiple connections to be open with potentially undesirable consequences. A loop that makes several attempts to create the IpcChannel may be a more appropriate solution.
Wednesday, February 15, 2012
Transactions as an Aspect
Documents that introduce AOP use the same examples when describing its practical applications. The first two, in this order, are always:
However, in multi-threaded environments, business object must also perform synchronization. That is, when performing an operation the business object must guarantee it's not accessing shared data with itself on a different thread. To prevent this, it must acquire object locks. Since the locks are private (as they should be, to preserve encapsulation) to the object in question, and because how or what locks are acquired may change depending upon the method parameters and/or logic, it is up to each individual public object operation to perform the locking.
When an operation on an object with a transaction aspect applied the transaction would start first, then the object locks would be acquired. This is a problem because you never want to start a database transaction first. If an object lock takes longer to acquire than a database transaction is long the operation will fail. Instead, we want to acquire object locks first, then begin the transaction.
This would seem to indicate that transactions can't be represented as an aspect. However, we won't give up quite yet. One could argue that the synchronization I described could also be captured as an aspect. You could apply both aspects to the business object so long as you ensure the synchronization aspect is a layer above the transaction aspect. But can the synchronization aspect have enough knowledge to understand what locks to acquire? Would it not have to duplicate logic already present in the business object? How would this fit into a design that cached data access which had to acquire its own locks? Would we not be setting ourselves up for deadlocks?
- Logging
- Transactions
However, in multi-threaded environments, business object must also perform synchronization. That is, when performing an operation the business object must guarantee it's not accessing shared data with itself on a different thread. To prevent this, it must acquire object locks. Since the locks are private (as they should be, to preserve encapsulation) to the object in question, and because how or what locks are acquired may change depending upon the method parameters and/or logic, it is up to each individual public object operation to perform the locking.
When an operation on an object with a transaction aspect applied the transaction would start first, then the object locks would be acquired. This is a problem because you never want to start a database transaction first. If an object lock takes longer to acquire than a database transaction is long the operation will fail. Instead, we want to acquire object locks first, then begin the transaction.
This would seem to indicate that transactions can't be represented as an aspect. However, we won't give up quite yet. One could argue that the synchronization I described could also be captured as an aspect. You could apply both aspects to the business object so long as you ensure the synchronization aspect is a layer above the transaction aspect. But can the synchronization aspect have enough knowledge to understand what locks to acquire? Would it not have to duplicate logic already present in the business object? How would this fit into a design that cached data access which had to acquire its own locks? Would we not be setting ourselves up for deadlocks?
Documenting Code with Code
On a regulated software project, especially one regulated by a government body, design documentation is a must. It proves that designs were thought out and not developed in an ad hoc fashion, and captures key considerations made when the software was developed so that future developers, who may not have been part of the original project, are able to safely perform maintenance.
There are two popular methods for creating design documentation. One method is to use tools like Word and Visio to manually create documentation. This method gives the writer absolute control over the contents of the document. However, this document will likely repeat documentation already found in your C# XML comments. Also, creating Visio diagrams by hand can be tedious and time consuming. And just because the documentation is being written doesn't mean the design won't change. In many Agile environments, hand written documentation will likely be obsolete the minute it is published.
Another method involves using a documentation generation tool, like doxygen for example. Tools like this extract documentation from code and format it as html, a pdf, etc. Because documentation can be generated on the fly, it will always be relevant. However, its really only good as reference documentation -- without sequence diagrams or textual design summaries providing context it would be difficult for a developer to understand the code for maintenance purposes. For those tools that can automatically generate object model diagrams and sequence diagrams from code, how will it know what level of detail is sufficient? How can it possibly have enough understanding of the code to diagram things into meaningful compartments. To get all of the benefits, but none of the drawbacks, a documentation tool that resides somewhere in the middle is needed.
Instead of either of the previous methods, build a documentation generation framework where documentation is generated from code. More precisely, the documentation IS code. Think about it:
There are two popular methods for creating design documentation. One method is to use tools like Word and Visio to manually create documentation. This method gives the writer absolute control over the contents of the document. However, this document will likely repeat documentation already found in your C# XML comments. Also, creating Visio diagrams by hand can be tedious and time consuming. And just because the documentation is being written doesn't mean the design won't change. In many Agile environments, hand written documentation will likely be obsolete the minute it is published.
Another method involves using a documentation generation tool, like doxygen for example. Tools like this extract documentation from code and format it as html, a pdf, etc. Because documentation can be generated on the fly, it will always be relevant. However, its really only good as reference documentation -- without sequence diagrams or textual design summaries providing context it would be difficult for a developer to understand the code for maintenance purposes. For those tools that can automatically generate object model diagrams and sequence diagrams from code, how will it know what level of detail is sufficient? How can it possibly have enough understanding of the code to diagram things into meaningful compartments. To get all of the benefits, but none of the drawbacks, a documentation tool that resides somewhere in the middle is needed.
Instead of either of the previous methods, build a documentation generation framework where documentation is generated from code. More precisely, the documentation IS code. Think about it:
- Don't ever hand draw diagrams - use code to describe them and have the framework generate them. Include as much or as little detail as is needed.
- Generate your documentation with every build - use strong types within the documentation so your build will break if it isn't up to date which means your documentation is never obsolete.
- Incorporate your C# comments at will.
- Allow for blocks of text - they can accompany diagrams to provide in-depth descriptions.
- Tweak your framework to completely change the look and style of your generated documentation in one fell swoop.
Subscribe to:
Posts (Atom)