How to fix the embedded text qualifier issue while exporting data to CSV flat file? How to fix the embedded text qualifier issue while exporting data to CSV flat file? sql-server sql-server

How to fix the embedded text qualifier issue while exporting data to CSV flat file?


I wouldn't offer this answer except that you worked so hard to document it and it's been upvoted with no answer after a month. So, here goes. Your only choices appear to be to change the data or change the tool.

Probably, I am clearly doing something wrong and missing the obvious. Could someone please explain to me what I am doing wrong here?

When the tool is broken and the vendor doesn't care, it's mistake to keep trying. It's time to switch. You put a lot of effort into researching exactly how it's broken and demonstrating it violates not only the RFC but the tool's own prior version. How much more evidence do you need?

CSV is a boat anchor too. If you have the option, you're better off using an ordinary delimited file format. For lots of applications, tab-delimited is good. The best delimiter IMO is '\' because that character has no place in English text. (On the other hand it won't work for data containing Windows pathnames.)

CSV has two problems as an exchange format. First, it's not all that standard; different applications recognize different versions, whatever the RFC may say. Second (and related) is that it doesn't constitute a regular language in CS terms, which is why it can't be parsed as a regular expression. Compare with ^([^\t]*\t)*[\t]*$ for a tab-delimited line. The practical implication of the complexity of CSV's definition is (see above) the relative dearth of tools to handle them and their tendency to be incompatible, particularly during the wee hours.

If you give CSV and DTS the boot, you have good options, one of which is bcp.exe. It's very fast, and safe because Microsoft hasn't been tempted to update it for years. I don't know much about DTS, but in case you have to use it for automation, IIRC there is a way to invoke external utilities. Beware though, that bcp.exe does not return error status to the shell dependably.

If you're determined to use DTS and to stick with CSV, then really your best remaining option is to write a view that prepares the data appropriately for it. I would, if backed into that corner, create a schema called, say, "DTS2012CSV", so that I could write select * from DTS2012CSV.tablename, giving anyone who cares a fighting chance to understand it (because you'll document it, won't you, in comments in the view text?). If need be, others can copy its technique for other broken extracts.

HTH.


I know this is two years old, but I am also now having this issue, as we need to use SQL Server 2008 for a contract we have (don't ask). After reading through this question, I realized I needed to do the replace suggestion, but when I went to do it in the query, I ran into truncation issues, because using the replace() function in the query itself would convert the text to a varchar(8000) by default.

However, I discovered I could do the same thing using a Derived Column step in between the DB Source and Flat File objects. For example, I have a column named "short_description," that could have quotes in it, so I just used the following function as the expression, and selected "Replace short_description" in the Derived Column:

REPLACE(short_description,"\"","\"\"")

This seems to have solved the issue for me.


Often the first and last name is in the same field and formatted (Last, First). This needs to be text qualified if you're using Tasks->Export Data right off the database (not via SSIS where you have more options) and you need to export to CSV as comma-delimited file.

This will help in your non-null selected fields that need double quoting...

CASE WHEN NOT PersonName IS NULL AND LEN(PersonName) > 0 THEN QUOTENAME(PersonName, '"') ELSE NULL END as 'PersonName'

Result:

PersonName

"COLLINS, ZACKERY E"