ERROR – The output column “Column name” (Lineage Id) on the non-error output has no corresponding output column on the error output

If your OleDB source shows the error above and you are 100% sure that your query is right and you are able to execute it directly from the management studio then the issue is one of the below:

  1. you have 2 columns with the same name but different casing, like ColumnName1 and COLUMNname, if both are needed to give an alias to one of them.
  2. you changed your column names and some of them are cached and conflicting with the new columns, so you need to delete the source component and add a new one, make sure you copy your query before losing it!

SSIS – Global replace for all columns using a script component

Sometimes while in the data flow you need to apply the same function on all the columns in the data flow or all columns have a specific data type, for example:

  1. Remove all commas and new lines from text columns before exporting them to a CSV file.
  2. Replace text “Null” with Null value when loading data from an excel file that has nulls as text in it.
  3. Check for each column if has a null value and replace it with a default value based on the data type.
  4. Convert all strings to upper or lower case. etc…

The easiest way to do this is to add a script component then use the approach below, here i’m converting all strings to upper case.


foreach (PropertyInfo dataColumn in Row.GetType().GetProperties())
{

    // skip columns end with _IsNull as these columns just indicate if the object has a null value
    //also look for columns with string datatype as numeric, booolean, date datat types can't have null as a string
    if (dataColumn.Name.ToLower().EndsWith("_isnull") == false && dataColumn.PropertyType == typeof(string))
    {
            object objValue = dataColumn.GetValue(Row, null);

            if (objValue != null)
            {
                 string value = objValue.ToString().ToUpper();
                 //set the new value
                 dataColumn.SetValue(Row, value, null);
            }
    }
}

few notes about the code above:

  • i’m skipping all columns with name ends with “_isnull” as these columns just to indicate if the original column has a null value as we don’t need them here.
  • always check that the value is not null before converting it to another data type.
  • i’m only applying the change to columns with type of string, you might not need to do that or change it to another type.

Error – External table is not in a valid format when you access an excel file

There could be many reasons for this error; one of them is that the file is locked by another user, for example your file is on a shared drive and another user has the file opened, or the file was uploaded or created by another application and the application still has a lock on it, if this was a one time load then you just need to copy the file and load the copied version of the file.

In case this was an automated recurring process then every time you need to copy the file to another folder, load the copied file then delete it.

hope this is helpful!

SSIS Expression to Get File Name From Full Path


RIGHT( @[User::FullPath] , FINDSTRING(REVERSE( @[User::FullPath] ) , "\\", 1) - 1)

Here is what it does:

  • Reverse the full path so the last backslash (\) becomes the first one.
  • Get the position of the first backslash in the reversed string and this way we would know the length of the file name.
  • Use the RIGHT() function get the file name since we know its length.

Error – Retrieving the COM class factory for component with CLSID {00024500-0000-0000-C000-000000000046} failed due to the following error: 80070005 Access is denied. (Exception from HRESULT: 0x80070005 (E_ACCESSDENIED)).

I was using office interop assemblies to do some customization on an excel report the process generates, I was able to do everything I wanted with the excel file on development machine and it worked just fine, but when I deployed to production I started getting the error “Retrieving the COM class factory for component with CLSID {00024500-0000-0000-C000-000000000046} failed due to the following error: 80070005 Access is denied. (Exception from HRESULT: 0x80070005 (E_ACCESSDENIED)).” When I run the process through a SQL job, I used SSIS 2012 with the project deployment model but this applies to any process using the office interop assemblies.

Although the process was running fine when I login with the same user executing the job and execute it from visual studio I was still getting the error when it runs through a scheduled job, I had to spend a good time researching the issue and found the answer over few forums so I though to combine everything in one place.

Here is what I did to fix the issue, hopefully this works for you as well:

  • Install Interop Assemblies: make sure either Microsoft office installed  with .NET Programmability Support feature selected or install the assemblies directly from microsoft website.
  • Assemblies Version: check the version of the assemblies on development and production machines, the assemblies will be in the GAC, in widows 7 this folder is %windir%\assembly. of course the versions should match.
  • Desktop Folder: the service uses the desktop folder under systemprofile so you will need to create this folder if not there, here is the location of the folder:
    • For 64 bit applications : C:\Windows\SysWOW64\config\systemprofile\Desktop
    • For 32 bit applications : C:\Windows\System32\config\systemprofile\Desktop
  • DCOM Running User and Permissions: you need to find the excel application entry in the component services console and set the security and identity for it:
    1. First if the user running the service is a domain user make sure it’s added as a user on the server machine.
    2. In the run window type dcomcnfg.
    3. Expand Component Services –> Computers –> My Computer –> DCOM Config.
    4. look for Microsoft Excel Application or CLSID {00024500-0000-0000-C000-000000000046}, if you can’t find it then it’s possible that you are running 32 bit office over 64 bit machine, look at the end of the post for steps to find it.
    5. right click the item and select properties, then select the Security tab.
    6. select Customize for Launch and Activation Permissions and Access Permission and add the user running the service then give it full access.
    7. go to the Identity tab and select This User option and add the user running the service then type it’s password.
    8. click Ok then restart your service.

this step should do it for you, if not you can try the next one.

  • COM Security: you need to set the permission explicitly for the user running the process, here is how to do it:
      1. In the run window type dcomcnfg.
      2. expand Component services –> Computers  then right click on My computer and select Properties.
      3. Select COM security tab and in the Launch and Activation permissions panel click Edit Defaults.

    COM Security

  • add the user executing the service and give it Local Launch and Local Activation permissions.
  • Quit Application: this might not be so much relative but make sure you release the instance of Excel.Application object in your code using the Quit() method, this way your process will not lock the file.

*If you couldn’t find the excel application in the DCOM config then try to find it in the 32 bit console as following:

  • in the run window type mmc -32.
  • go to file and select Add/Remove Snap in…
  • Select component services and click Add.
  • click OK then continue from point number 2 above.

I might have been very generous with the permissions here but I found out it has to do with the domain security policy and your server setup, so after you get this to work I suggest you go back and narrow down the permissions to the minimum that keeps you application running.

SSIS – Importing an excel file with over 255 columns

One of the several limitations comes with the excel driver is the number of columns you can import, you will find that excel only reads the first 255 columns of the file and ignores the rest, this is often not a big issue as most of the files you work with will have less than 255 columns but when it has more then it’s a real issue as there doesn’t seem to be an easy workaround for it.

The best way I found to get around this is to use two data source components, each one reads a different set of columns then use a merge join transformation to combine the data together again, this approach works based on the fact that excel will read both sets of data in the same order they are in, which I found true after trying this on different files, here is how it works:

  1. Add an excel sources to the data flow, set the data access mode to SQL command and set the command to “Select * from [Sheet1$A:IT]“, this tells the driver to return the data for all columns between A and IT which will be the first 255 columns, I’m assuming here the sheet name is Sheet1.
    Excel Source
  2. add another excel source, configure it the same way you did the first one, you can reuse the same connection manager but make the SQL command  “Select * from [Sheet1$IU:MA]”, my last column was MA here so you will need to replace MA with the last column in your sheet.
  3. now we need to add an identity column to each data set so we can merge them together in the next step, we will add a script task and add the identity column as an output column then inside the script assign an incremental values to it, again the data order will not change between the 2 sources, so the first row in the excel file will take the value 1 as identity on each side. output column
    Script
  4. in order to use the merge join we need to sort both of it’s inputs, so we add a sort task on each side and set it to order the data based on the identity column we added earlier in the script task.
  5. now add the merge join and join both outputs.

Data Flow

this way you will have all the columns in the data flow then you can continue processing you data as normal.

SSIS – OnPreExecute and OnPostExecute Fire Multiple Times

If you are trying to use an event handler like OnPreExecute or OnPostExecute in SSIS to log execution information you will notice that each event is being fired multiple times even if you configure the event at package level but you need these events to fire only once!

One quick way to fix this is to add an empty script task at the beginning of the event handler and on the precedence constraint coming out of that task to set:

  • Evaluation Operation = Expression
  • Expression = @[System::SourceName] ==  @[System::PackageName]

Image

Transact-SQL User Function to Subtract Business Days

I came across the requirement of building a function that subtracts business days from a specific date value, the problem with this is you don’t know if a weekend will occur between those days, and if you are looking at more than 7 days then there could be more than one weekend.

The best way I could think of is to use a recursive CTE that keeps incrementing a counter as far as it’s not a weekend day until it reaches the specified number of days.

One thing you need to note about this method is because it uses a recursive CTE by default you will not be able to go over 70 days or so because the max recursion is 100 times so you will need to set the MAXRECURSION option to a number that is over the total of calendar days during that period, it should be added under the select statement like:

SELECT @ReturnDate = MIN(DayDate) FROM Dates

OPTION (MAXRECURSION 1000)

Here is the function:

CREATE FUNCTION fn_SubtractBusinessDays(@startDate date, @Days INT )

RETURNS DATE AS
BEGIN

DECLARE @ReturnDate DATE;

WITH Dates AS
(

SELECT @startDate AS BusinessDay, 0 AS DaysCount

 UNION ALL

 SELECT DATEADD(d,-1,BusinessDay) AS BusinessDay ,
DaysCount + (CASE WHEN DATEPART(dw, DATEADD(d,-1,BusinessDay)) NOT IN (7,1) THEN 1 ELSE 0 END) DaysCount
 FROM Dates
 WHERE DaysCount < @days

)

SELECT @ReturnDate = MIN(BusinessDay) FROM Dates

RETURN @ReturnDate

END

You get an error says “Login failed for user ‘NT AUTHORITY\ANONYMOUS LOGON'” when you run an SSIS 2012 package from a stored procedure

You heard that finally Microsoft provided an easy way to run an SSIS package from a stored procedure with SQL server 2012 release when you use the new project deployment model, you deploy your project and call the right stored procedures and as far as you are calling them from the same database server where they are deployed and with a windows account that has the needed permissions everything works great.

Then you decide to create a stored procedure and send it to the web developers or execute form anywhere outside of the database server and you are surprised that the package is throwing the error in the subject Login failed for user ‘NT AUTHORITY\ANONYMOUS LOGON’ or any other error implies that the user executing the package has insufficient permissions.

The problem is that integration services as of the 2012 release does not support credentials delegation which means even though the SSIs package is running under the right windows account it does not pass that account when it tries to access the database or file system as they say at the end of this topic.

This is really a problem because in most of the cases your application that needs to call SSIS will reside on a different server and this method is useless, the issue was reported to Microsoft and they said it will be fixed in the next release but you need a solution now! There are few workarounds but none of them is pretty but you might find them helpful in your case:

    1. SQL Authentication: If your problem is only with the database access then you can use SQL authentication instead of windows authentication – although it’s not recommended, and to secure the password you need to parameterize the connection manager and make the password a parameter and check the sensitive checkbox.
    2. Call the execution statement through a SQL job: the only command that matters when it comes to authentication in the call to the start_execution function, which is good because you don’t have to change the whole stored procedure you created and you only do it once and you use it for all packages, this is the option i ended up using as my package needs to access the file system, here is what you need to do:
      • remove the call to start execution from you procedure, which is EXEC [SSISDB].[catalog].[start_execution] @execution_id
      •  Replace it with an insert statement to a queue table; the table will have the following fields:
        • ExecutionID: bigint to hold theid for the execution instance you started.
        • Processed: a bit field to indicate if the instance was executed.
        • any other fields like table id or time stamp would be helpful for you.
      • your stored procedure will look something like this:   
        Create Procedure [dbo].[usp_GenerateReport]
        (
         @DateTo datetime,
         @ExecutionID bigint output
        )
        AS
        BEGIN
        
        Declare @execution_id bigint
        EXEC [SSISDB].[catalog].[create_execution] @package_name=N'PackagName.dtsx', @execution_id=@execution_id OUTPUT, @folder_name=N'Folder1', @project_name=N'Folder1', @use32bitruntime=True, @reference_id=2
        
        --set parameters in case you have any
        EXEC [SSISDB].[catalog].[set_execution_parameter_value] @execution_id, @object_type=30, @parameter_name=N'DateTo', @parameter_value=@DateTo
        
        --EXEC [SSISDB].[catalog].[start_execution] @execution_id
        
        --insert into my queue table
        Insert into SSISProcessRequest values (@execution_id, 0)
        
        END
        
      • now create an SQL agent job that runs an SQL command under the windows account that you would like to execute the package, it will check for packages with processed flag equals false, execute them then set the flag to true, my service runs every 15 seconds and has the following SQL code:  
        Declare @ExecutionID bigint, @SSISProcessRequestID int
        WHILE EXISTS (SELECT 1 FROM SSISProcessRequest WHERE Processed = 0)
        BEGIN
        
         SELECT TOP 1 @ExecutionID = ExecutionID FROM SSISProcessRequest WHERE Processed = 0 ORDER BY DateRequested;
        
         EXEC [SSISDB].[catalog].[start_execution] @ExecutionID;
        
         UPDATE SSISProcessRequest SET Processed = 1 WHERE ExecutionID = @ExecutionID;
        
        END
        
        
    1. Web Service: create a web service on the SSIS server to call the package, I didn’t try this option myself and usually network admins do not prefer to open web ports on the database servers for security reasons, but it could be an option for you.