ERROR – The output column “Column name” (Lineage Id) on the non-error output has no corresponding output column on the error output

If your OleDB source shows the error above and you are 100% sure that your query is right and you are able to execute it directly from the management studio then the issue is one of the below:

  1. you have 2 columns with the same name but different casing, like ColumnName1 and COLUMNname, if both are needed to give an alias to one of them.
  2. you changed your column names and some of them are cached and conflicting with the new columns, so you need to delete the source component and add a new one, make sure you copy your query before losing it!

SSIS – Global replace for all columns using a script component

Sometimes while in the data flow you need to apply the same function on all the columns in the data flow or all columns have a specific data type, for example:

  1. Remove all commas and new lines from text columns before exporting them to a CSV file.
  2. Replace text “Null” with Null value when loading data from an excel file that has nulls as text in it.
  3. Check for each column if has a null value and replace it with a default value based on the data type.
  4. Convert all strings to upper or lower case. etc…

The easiest way to do this is to add a script component then use the approach below, here i’m converting all strings to upper case.


foreach (PropertyInfo dataColumn in Row.GetType().GetProperties())
{

    // skip columns end with _IsNull as these columns just indicate if the object has a null value
    //also look for columns with string datatype as numeric, booolean, date datat types can't have null as a string
    if (dataColumn.Name.ToLower().EndsWith("_isnull") == false && dataColumn.PropertyType == typeof(string))
    {
            object objValue = dataColumn.GetValue(Row, null);

            if (objValue != null)
            {
                 string value = objValue.ToString().ToUpper();
                 //set the new value
                 dataColumn.SetValue(Row, value, null);
            }
    }
}

few notes about the code above:

  • i’m skipping all columns with name ends with “_isnull” as these columns just to indicate if the original column has a null value as we don’t need them here.
  • always check that the value is not null before converting it to another data type.
  • i’m only applying the change to columns with type of string, you might not need to do that or change it to another type.

Error – External table is not in a valid format when you access an excel file

There could be many reasons for this error; one of them is that the file is locked by another user, for example your file is on a shared drive and another user has the file opened, or the file was uploaded or created by another application and the application still has a lock on it, if this was a one time load then you just need to copy the file and load the copied version of the file.

In case this was an automated recurring process then every time you need to copy the file to another folder, load the copied file then delete it.

hope this is helpful!

SSIS Expression to Get File Name From Full Path


RIGHT( @[User::FullPath] , FINDSTRING(REVERSE( @[User::FullPath] ) , "\\", 1) - 1)

Here is what it does:

  • Reverse the full path so the last backslash (\) becomes the first one.
  • Get the position of the first backslash in the reversed string and this way we would know the length of the file name.
  • Use the RIGHT() function get the file name since we know its length.

Error – Retrieving the COM class factory for component with CLSID {00024500-0000-0000-C000-000000000046} failed due to the following error: 80070005 Access is denied. (Exception from HRESULT: 0x80070005 (E_ACCESSDENIED)).

I was using office interop assemblies to do some customization on an excel report the process generates, I was able to do everything I wanted with the excel file on development machine and it worked just fine, but when I deployed to production I started getting the error “Retrieving the COM class factory for component with CLSID {00024500-0000-0000-C000-000000000046} failed due to the following error: 80070005 Access is denied. (Exception from HRESULT: 0x80070005 (E_ACCESSDENIED)).” When I run the process through a SQL job, I used SSIS 2012 with the project deployment model but this applies to any process using the office interop assemblies.

Although the process was running fine when I login with the same user executing the job and execute it from visual studio I was still getting the error when it runs through a scheduled job, I had to spend a good time researching the issue and found the answer over few forums so I though to combine everything in one place.

Here is what I did to fix the issue, hopefully this works for you as well:

  • Install Interop Assemblies: make sure either Microsoft office installed  with .NET Programmability Support feature selected or install the assemblies directly from microsoft website.
  • Assemblies Version: check the version of the assemblies on development and production machines, the assemblies will be in the GAC, in widows 7 this folder is %windir%\assembly. of course the versions should match.
  • Desktop Folder: the service uses the desktop folder under systemprofile so you will need to create this folder if not there, here is the location of the folder:
    • For 64 bit applications : C:\Windows\SysWOW64\config\systemprofile\Desktop
    • For 32 bit applications : C:\Windows\System32\config\systemprofile\Desktop
  • DCOM Running User and Permissions: you need to find the excel application entry in the component services console and set the security and identity for it:
    1. First if the user running the service is a domain user make sure it’s added as a user on the server machine.
    2. In the run window type dcomcnfg.
    3. Expand Component Services –> Computers –> My Computer –> DCOM Config.
    4. look for Microsoft Excel Application or CLSID {00024500-0000-0000-C000-000000000046}, if you can’t find it then it’s possible that you are running 32 bit office over 64 bit machine, look at the end of the post for steps to find it.
    5. right click the item and select properties, then select the Security tab.
    6. select Customize for Launch and Activation Permissions and Access Permission and add the user running the service then give it full access.
    7. go to the Identity tab and select This User option and add the user running the service then type it’s password.
    8. click Ok then restart your service.

this step should do it for you, if not you can try the next one.

  • COM Security: you need to set the permission explicitly for the user running the process, here is how to do it:
      1. In the run window type dcomcnfg.
      2. expand Component services –> Computers  then right click on My computer and select Properties.
      3. Select COM security tab and in the Launch and Activation permissions panel click Edit Defaults.

    COM Security

  • add the user executing the service and give it Local Launch and Local Activation permissions.
  • Quit Application: this might not be so much relative but make sure you release the instance of Excel.Application object in your code using the Quit() method, this way your process will not lock the file.

*If you couldn’t find the excel application in the DCOM config then try to find it in the 32 bit console as following:

  • in the run window type mmc -32.
  • go to file and select Add/Remove Snap in…
  • Select component services and click Add.
  • click OK then continue from point number 2 above.

I might have been very generous with the permissions here but I found out it has to do with the domain security policy and your server setup, so after you get this to work I suggest you go back and narrow down the permissions to the minimum that keeps you application running.

SSIS – Importing an excel file with over 255 columns

One of the several limitations comes with the excel driver is the number of columns you can import, you will find that excel only reads the first 255 columns of the file and ignores the rest, this is often not a big issue as most of the files you work with will have less than 255 columns but when it has more then it’s a real issue as there doesn’t seem to be an easy workaround for it.

The best way I found to get around this is to use two data source components, each one reads a different set of columns then use a merge join transformation to combine the data together again, this approach works based on the fact that excel will read both sets of data in the same order they are in, which I found true after trying this on different files, here is how it works:

  1. Add an excel sources to the data flow, set the data access mode to SQL command and set the command to “Select * from [Sheet1$A:IT]“, this tells the driver to return the data for all columns between A and IT which will be the first 255 columns, I’m assuming here the sheet name is Sheet1.
    Excel Source
  2. add another excel source, configure it the same way you did the first one, you can reuse the same connection manager but make the SQL command  “Select * from [Sheet1$IU:MA]”, my last column was MA here so you will need to replace MA with the last column in your sheet.
  3. now we need to add an identity column to each data set so we can merge them together in the next step, we will add a script task and add the identity column as an output column then inside the script assign an incremental values to it, again the data order will not change between the 2 sources, so the first row in the excel file will take the value 1 as identity on each side. output column
    Script
  4. in order to use the merge join we need to sort both of it’s inputs, so we add a sort task on each side and set it to order the data based on the identity column we added earlier in the script task.
  5. now add the merge join and join both outputs.

Data Flow

this way you will have all the columns in the data flow then you can continue processing you data as normal.

SSIS – OnPreExecute and OnPostExecute Fire Multiple Times

If you are trying to use an event handler like OnPreExecute or OnPostExecute in SSIS to log execution information you will notice that each event is being fired multiple times even if you configure the event at package level but you need these events to fire only once!

One quick way to fix this is to add an empty script task at the beginning of the event handler and on the precedence constraint coming out of that task to set:

  • Evaluation Operation = Expression
  • Expression = @[System::SourceName] ==  @[System::PackageName]

Image