Simple PowerShell script to convert a CSV file into an XML file

by Klaus Graefensteiner 24. June 2010 13:31

While I was working on my 50-States blog post I needed a way to convert a CSV file to an XML file. I was hoping that Microsoft’s Excel 2010 would do that, but I couldn’t figure out how. This feature is not discoverable and the help still seems to be worked on. It is completely useless.

PowerShell to the rescue:

OneRingOrPowerShell

Figure 1: One Ring or PowerShell to rule them all

The Script

The scripts assumes that the first line in the CSV file contains the header names of the columns and re-uses the names as attributes. The conversion script needs the names for a single entity and the collection as additional parameters besides the CSV file name. One feature I needed to add was to recombine fields that had a comma in the field name and where enclosed by quotation marks. My script will take care of this situation.

Warning: The script can be used as a good example for a bad coding style.

Set-DebugMode
$CSVPath = "C:\Users\Klaus\SVN\PS\DownloadWeatherHistory\States.csv"



if(Test-Path -PathType leaf -Path $CSVPath)
{
    $CSVLines = Get-content -Path $CSVPath -Encoding "UTF8"
}
else
{
    Write-Error "File $CSVPath not found"
}

$FieldNames = New-Object -TypeName "System.Collections.ArrayList"
$FieldValues = New-Object -TypeName "System.Collections.ArrayList"

if ($CSVLines -ne $null -and $CSVLines.count -gt1)
{
    for($i = 0; $i -lt $CSVLines.count; $i++)
    {
        $values = New-Object -TypeName "System.Collections.ArrayList"
        $fields = ($CSVLines[$i]).Split(",");
        Write-Debug "$fields"
        
        $recombining = $false
        $combinedValue = [String]::Empty
        if($fields -ne $null -and $fields.count -gt 0)
        {
            for($f = 0; $f -lt $fields.count; $f++)
            {
                $value = [String]::Empty
                if($i-eq 0)
                {
                    $slug = ($fields[$f]).replace(" ", "-");
                    Write-Debug $slug
                    $null = $FieldNames.Add($slug)
                }
                else
                {
                    if(($fields[$f]).startsWith('"') -and -not ($fields[$f]).endsWith('"'))
                    {
                        $recombining = $true;
                        $combinedValue = [String]::Empty
                    }
                    

                    if($recombining)
                    {
                        $combinedValue += ($fields[$f]).replace('"', '')
                    }

                    
                    if(($fields[$f]).endsWith('"') -and -not ($fields[$f]).startsWith('"'))
                    {
                        $recombining = $false;
                        write-Debug $combinedValue
                    }
                    
                    if(-not $recombining)
                    {
                        if($fields[$f] -eq $null)
                        {
                            $value = [String]::Empty
                        }
                        elseif($combinedValue -ne [String]::Empty)
                        {
                            $value = $combinedValue
                            $combinedValue = [String]::empty
                        }
                        else
                        {
                            $value = ($fields[$f]).trim()
                        }
                        $null = $values.Add($value)
                    }
                }
            }
        }
        else
        {
            Write-Error "Insufficient row format"
        }
        if($values.count -gt 0)
        {
            $null = $FieldValues.Add($values);
        }
    }
    
}
else
{
    Write-Error "Insufficient csv file format"
}

$FieldNames
$FieldValues.GetType().FullName
$FieldValues.Count

function Write-XML([System.Collections.ArrayList] $Names, [System.Collections.ArrayList] $ValueSets, [Switch] $AsElements, [String] $CollectionName, [String] $EntityName, [String] $Path)
{
    $Lines = New-Object -TypeName "System.Collections.ArrayList"
    
    $Lines.Add('<?xml version="1.0" encoding="UTF-8" standalone="yes"?>') | out-null
    $Lines.Add("<$CollectionName>")
    
    #Foreach item in values
    
    for( $i = 0; $i -lt $ValueSets.count; $i++)
    {
        $ValueSet = $ValueSets[$i]
        $Line = "  <$EntityName ";
        if($ValueSet.Count -ne $Names.Count)
        {
            Write-Error "Datasets are inconsistent"
        }
        else
        {
            for($u=0; $u -lt $Names.Count; $u++)
            {
                 $Line += $Names[$u] + '="' + $ValueSet[$u] + '" '
            }   
        }
        $Line = $Line + "/>`n";
        $Lines.Add($Line) | out-null
        $Line = [String]::Empty
    }
    
    $Lines.Add("</$CollectionName>")
    
    $Lines | set-content -Encoding "UTF8" -Path $Path

}

Write-XML -Names $FieldNames -Values $FieldValues -CollectionName "states" -EntityName "state" -Path "C:\Users\Klaus\SVN\PS\DownloadWeatherHistory\States.xml"



#CALIFORNIA
#CA
#Sacramento
#Los Angeles
#"36
#961
#664"
#"163
#707"
#PT (UTC-8)

#YES


#<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
#<states>
#	<state name="ALABAMA" abbreviation="AL"/>
#	<state name="ALASKA" abbreviation="AK"/>
#	<state name="WISCONSIN" abbreviation="WI"/>
#	<state name="WYOMING" abbreviation="WY"/>
#</states>

Download

The script can be downloaded here: Convert-CsvToXML.zip

Ausblick

You gotta love dynamic languages. PowerShell makes you extremely productive when working with the Windows platform. I hope it will be an object oriented language some day.

Tags: , ,

PowerShell | XML | Tips & Tricks

Comments are closed

About Klaus Graefensteiner

I like the programming of machines.

Add to Google Reader or Homepage

LinkedIn FacebookTwitter View Klaus Graefensteiner's profile on Technorati
Klaus Graefensteiner

Klaus Graefensteiner
works as Developer In Test and is founder of the PowerShell Unit Testing Framework PSUnit. More...

Open Source Projects

PSUnit is a Unit Testing framwork for PowerShell. It is designed for simplicity and hosted by Codeplex.
BlogShell is The tool for lazy developers who like to automate the composition of blog content during the writing of a blog post. It is hosted by CodePlex.

Administration

About

Powered by:
BlogEngine.Net
Version: 1.6.1.0

License:
Creative Commons License

Copyright:
© Copyright 2014, Klaus Graefensteiner.

Disclaimer:
The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.

Theme design:
This blog theme was designed and is copyrighted 2014 by Klaus Graefensteiner

Rendertime:
Page rendered at 11/23/2014 8:49:01 PM (PST Pacific Standard Time UTC DST -7)