Set-Content -Encoding parameter matters, if the content says so...

by Klaus Graefensteiner 11/5/2008 7:14:52 AM

Introduction

I use a simple PowerShell script to replace some of the URLs in the HTML source of my blog posts before I publish them to www.tellingmachine.com. In my case the posts are stored as XML files. I usually write my posts in Windows Live Writer. While authoring, I frequently publish the documents for test purposes to the Visual Studio 2008 development web server that runs locally on my machine. Once a post is ready to go online, I take the XML file, run the PowerShell script against it, and then copy it to my production server. Occasionally the XML files refused to open in Internet Explorer after I had run the script. It took me a few minutes to figure out why. Here is the story!

Figure 1: XML Architecture

Differences

The differences between a post published to Visual Studio and one published to a production server are mainly the root path names of the hyperlink references. In my case I need to change the relative URLs for two things: the picture paths and the paths to files that are referred to by a download link. Here are the actual replace instructions:

Before moving files from Visual Studio to the production server replace...

  • "/BlogEngine.Web/file.axd" with "/file.axd" for downloads
  • "/BlogEngine.Web/image.axd" with "/image.axd" for picture links
  • "http://localhost:<port>/BlogEngine.Web" with "http://www.tellingmachine.com"
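As a quick illustration, the rules can be tried on a single line of markup before running them against real posts. This is only a sketch: the sample line, the port 1234 and the file names are made up, and the "/BlogEngine.Web" prefix on the download rule is assumed by analogy with the picture rule.

```powershell
# Hypothetical line as it might appear in a post saved against the local
# development server. Port 1234 and the file names are invented.
$sample = '<a href="http://localhost:1234/BlogEngine.Web/post/x.aspx"><img src="/BlogEngine.Web/image.axd?picture=a.png" /></a> <a href="/BlogEngine.Web/file.axd?file=b.zip">b.zip</a>'

# Each rule is a pair: regular expression, replacement text.
# Dots are escaped so that they only match a literal dot.
$rules = @(
    @('http://localhost:\d+/BlogEngine\.Web', 'http://www.tellingmachine.com'),
    @('/BlogEngine\.Web/file\.axd',           '/file.axd'),
    @('/BlogEngine\.Web/image\.axd',          '/image.axd')
)

# Apply the rules one after the other.
$result = $sample
foreach ($rule in $rules) {
    $result = $result -replace $rule[0], $rule[1]
}

$result
# <a href="http://www.tellingmachine.com/post/x.aspx"><img src="/image.axd?picture=a.png" /></a> <a href="/file.axd?file=b.zip">b.zip</a>
```

Keeping the rules in one list makes it easy to add or remove a replacement without touching the loop.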

PowerShell script with a bug

Here is my first attempt to run the replace task with PowerShell. This one sometimes produced malformed XML.

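   cd $home\desktop\PostPub
   dir *.xml | ForEach-Object {
       # Read the post (one string per line)
       $text = $_ | Get-Content
       # Point absolute links at the production host
       $text = $text -replace 'http://localhost:\d+/BlogEngine\.Web', 'http://www.tellingmachine.com'
       # Strip the local virtual directory from download links
       $text = $text -replace '/BlogEngine\.Web/file\.axd', '/file.axd'
       # Bug: no -Encoding given, so Set-Content uses its default encoding
       Set-Content -Path $_.FullName -Value $text -Force
   }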

The bug

The web server threw an exception when it tried to read the malformed XML post file. And when I tried to open the XML file directly in Internet Explorer, I got the following error.

Figure 2: Error opening XML in Internet Explorer

At first I thought that there was a problem with the regular expression, because when I did the search and replace manually in Visual Studio, I wasn't able to reproduce the issue. It always worked in Visual Studio, but running the script on certain blog posts always caused the error.

The epiphany

I stared at the PowerShell script in PowerGUI, hoping to spot the problem. And to my surprise, I found the solution. Can you see it too?

Figure 3: PowerGUI hint

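Exactly, look at the second line in the Locals window: encoding="utf-8". The XML actually prescribes what encoding format to use. Without an -Encoding parameter, Set-Content writes the file in its default encoding rather than the declared UTF-8, so a post that contains non-ASCII characters no longer matches its own declaration. From this point on the fix was a no-brainer. All I needed to do was to specify the -Encoding parameter in the Set-Content cmdlet, and that's it.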
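To make the mismatch concrete, here is a small sketch that encodes one non-ASCII character both ways and then checks whether the bytes are valid UTF-8. ISO-8859-1 is used here only as a stand-in for a single-byte ANSI code page, and the helper names ConvertTo-HexString and Test-Utf8Bytes are made up for this example.

```powershell
# U+00FC (u with umlaut), built from its code point so that the encoding of
# this script file itself does not matter.
$ch = [string][char]0xFC

# The same character in UTF-8 and in a single-byte encoding.
$utf8Bytes  = [System.Text.Encoding]::UTF8.GetBytes($ch)
$latinBytes = [System.Text.Encoding]::GetEncoding('iso-8859-1').GetBytes($ch)

# Hypothetical helper: format a byte array as hex, e.g. "C3 BC".
function ConvertTo-HexString {
    param([byte[]]$Bytes)
    ($Bytes | ForEach-Object { $_.ToString('X2') }) -join ' '
}

# Hypothetical helper: $true if the bytes decode as strict UTF-8.
function Test-Utf8Bytes {
    param([byte[]]$Bytes)
    # Second constructor argument: throw on invalid byte sequences.
    $strict = New-Object System.Text.UTF8Encoding($false, $true)
    try {
        [void]$strict.GetString($Bytes)
        return $true
    } catch {
        return $false
    }
}

'UTF-8:       {0}  valid UTF-8: {1}' -f (ConvertTo-HexString $utf8Bytes), (Test-Utf8Bytes $utf8Bytes)
'Single-byte: {0}  valid UTF-8: {1}' -f (ConvertTo-HexString $latinBytes), (Test-Utf8Bytes $latinBytes)
```

A parser that trusts the encoding="utf-8" declaration hits the same invalid byte that the strict decoder rejects here. A post that contains only ASCII characters produces identical bytes in both encodings, which would explain why only some posts broke.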

The fixed PowerShell script

Here is the script that does the job right:

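   cd $home\desktop\PostPub
   dir *.xml | ForEach-Object {
       # Read the post (one string per line)
       $text = $_ | Get-Content
       # Point absolute links at the production host
       $text = $text -replace 'http://localhost:\d+/BlogEngine\.Web', 'http://www.tellingmachine.com'
       # Strip the local virtual directory from download links
       $text = $text -replace '/BlogEngine\.Web/file\.axd', '/file.axd'
       # Write the file in the encoding that the XML declaration prescribes
       Set-Content -Path $_.FullName -Value $text -Force -Encoding UTF8
   }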
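Before copying the files to the production server, it may be worth checking that every post still parses. The following is a sketch of such a check using the .NET XmlDocument class. The function name Test-XmlFile is made up for this example, and the folder is the one used in the script above.

```powershell
# Hypothetical helper: $true if the file loads as well-formed XML.
function Test-XmlFile {
    param([string]$Path)
    $doc = New-Object System.Xml.XmlDocument
    try {
        # Resolve to a full path first: .NET resolves relative paths against
        # the process working directory, not the current PowerShell location.
        $fullPath = (Resolve-Path -Path $Path -ErrorAction Stop).ProviderPath
        $doc.Load($fullPath)
        return $true
    } catch {
        # Missing file, malformed markup, or bytes that do not match the
        # declared encoding all end up here.
        return $false
    }
}

# Report the status of every post in the working folder, if it exists.
$folder = "$home\desktop\PostPub"
if (Test-Path $folder) {
    Get-ChildItem -Path $folder -Filter *.xml | ForEach-Object {
        $status = if (Test-XmlFile $_.FullName) { 'OK' } else { 'MALFORMED' }
        '{0}: {1}' -f $_.Name, $status
    }
}
```

Because the parser honors the encoding declared in the file, a post written in the wrong encoding should show up as MALFORMED here, before the web server ever sees it.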

Recycle App Pool

There is one important note for BlogEngine.NET users. If you publish a blog post XML file manually via FTP to the posts folder of your virtual directory, you need to invalidate the cache by recycling the App Pool, which forces the web application to pick up the new post. Otherwise the new post won't be displayed at all.

Download

The resources that this post is based on can be downloaded here: ReplaceURLs.zip

Outlook

The encoding matters, if the content prescribes it. That is the lesson I learned here. The XML file said it clearly: "The encoding must be utf-8!".
