SQL Generation & Templating

Introduction

I recently answered a question (http://e-e.com/questions/29100944) about updating rows in a table. This was a self-join scenario where some of the rows served as update data sources for other rows with matching keys.

Read More

Efficient String Clean-up

Introduction

I frequently answer questions where strings need to be 'cleaned' of multiple space characters. The most common fix is to remove leading or trailing spaces. For that problem, there are very handy intrinsic VB functions (LTrim, RTrim, Trim). However, these functions do not affect any repeated space characters between the first and last non-space characters -- the internal spaces.

 

In this article, I will explore different solutions to this problem and evaluate their performance. You should be able to apply this in any of the Office products (Access, Excel, PowerPoint, Word), VBScript, or VB6. However, the collection object is not available in the VBScript environment, so you won't be able to use or test any of the methods that use the collection object in your VBS batch jobs.

 

Note: You can use these methods to remove any repeating characters inside a string, not just repeated space characters.

 

Problem Context

You often have little say in the data you must process. Count yourself lucky if you get well-formed, normalized, and structured data. However, if you receive any of the following, this article may help you.

</p>

  • unstructured data (text files) that you will need to parse or format
  • text or memo (varchar) database fields with raw data
  • HTML or XML that may look good in a browser, but isn't easy to parse or format
  • XML or JSON (any tagged data format) that you need to compact before sending or saving

Performance Methodologies

The performance of the measured methods is affected by the length of the strings and the amount of (internal) space characters to be removed. For simplicity, I will evaluate the code and methods with different permutations on the Gettysburg Address -- one paragraph, the entire address, ten copies of the address concantenated in one string, and one hundred copies of the address concantenated in one string. A description of the permutation routine is at the end of this article.

Read More