The PODA Blog

News, views and articles from our membership

Archive for June, 2007

Text Data Sources in MS Access (part 2)

Posted by Tim Critchley on 12th June 2007

Judging a file by its cover

In the opening episode of this discussion, I suggested that many developers may be wary of flat files performing the role of data source for their system. From personal experience, I’d attribute the bulk of that sentiment to the fact that however reliable their source, text files are akin to the box of proverbial chocolates - and unless the dancing part of your system takes full responsibility for evaluating their content, your project will likely wind up in another shade of proverbial brown.

That word “evaluating” immediately infers that the project as a whole just became more complicated! Let’s consider some implied questions:

  • How much data does the file contain?
    A developer brash enough to overlooks this consideration, runs the real risk that at some point, the source file may be only partially populated or even empty. Without effective vigilance, database corruption will visit like a thief in the night!
  • How much data is expected?
    This question is an extrapolation of the first. To validate the amount of data contained in the source file, you need to have some idea of how much is “too little”.
  • What data types are involved?
    Acknowledging that a detailed profile of the data file is critical makes this an immediate requirement. A single rogue comma can render your static table corrupt, and the gremlins spawned when interpreting date formats standard from one region, in another, are a regular feature in online Q&A forums.

Taking just these three points to their logical conclusion may be enough to prompt even the most enterprising of developers to seek alternative arrangements. And in situations where no alternatives present themselves, the temptation to “muddle through” can seem comparatively appealing.

What follows is a task-specific guide to a comprehensive set of features native to Access, which when fully appreciated and assembled correctly; present a robust response to these challenges.

 

Smart Dino’

Now having the measure of our enemy, it’s time I introduced an invention of mine, the TREx Protocol – “Text Retrieval and Extraction”. I tried coming up with a mnemonic which conjured images of a lither creature, but this one describes the process best. As well as the utilisation of specific features unique to MS Access, TREx also represents a collection of strategic choices which compliment the technologies. The core of these being:

  1. Always process text files locally (on the client machine’s hard drive)
  2. Port text files with your Access project as linked tables
  3. Maintain static copies of the text file data in fully engineered Access tables

 

Protocol 1

MSDN KB article 300216 is essential reading material for budding Access developer and old hands alike. But the section “ADDITIONAL BEST PRACTICES FOR NETWORK ENVIRONMENTS” illustrates well the motivation for the first TREx directive. Access databases are at their most vulnerable when executing a data transaction over a network. As the article warns:

Microsoft Jet is a file-sharing database system. A file-sharing database is one in which all the processing of the file takes place at the client. When a file-sharing database, such as Microsoft Jet, is used in a multiuser environment, multiple client processes are using file read, write, and locking operations on the same shared file across a network. If, for any reason, a process cannot be completed, the file can be left in an incomplete or a corrupted state.

That’s to say, if one or both of the transaction members (client machine or the LAN) fails mid-process, database corruption is a very real danger.

So the smaller we keep that window of tragedy the better! Thus the first TREx protocol “process text files locally” offers a three-fold gain:

  • By excluding network issues, the number of transaction members during the text data assimilation process is effectively halved
  • The application can continue to function where used remotely, or if the network is unavailable for extended periods
  • Additionally, there is potential for a significant performance boost when reading a file from the hard drive, rather than interrogating it over a network

In most mid-large scale companies, text files are written from a mainframe or FTP server to a public folder on the LAN, from where various tasks can reference them. So let’s add another benefit to protocol 1:

  • It precludes file-locking conflicts

The API function CopyFile does this job nicely. It will copy a file even if it’s in use at the time, while the similar CopyFileEx can be used to provide additional feedback during the copy process. See the attached file containing sample routines demonstrating usage of the latter.

Portable CopyFileEx CodeAnalysis of the remaining TREx protocols however, continues in the next episode!

Posted in Access, Office (All) | 3 Comments »