Click here to Skip to main content
15,881,803 members
Articles / Productivity Apps and Services / Microsoft Office
Tip/Trick

VBA Code for Parsing US Address Information

Rate me:
Please Sign up or sign in to vote.
5.00/5 (5 votes)
25 May 2018CPOL3 min read 11.4K   8  
VBA Code that will take a string and attempt to pull out the company name, street address, city, state, zip code, phone number, and web address

Introduction

I have been manually entering address information in a Microsoft Access application I have for years, and basically all I have been doing is reparsing information that I can just copy bulk from the web. I knew this was stupid, and I finally decided to do something about it. I searched the web thinking that maybe I could use regular expression to do this. Never did find anything close, and so finally decided to do it myself.

NOTES

The way I use this code is to give preliminary values, and expect to have to edit some of the values to make corrections. This works fine for me since it is used to more quickly auto fill out a form, instead of having to type or copy and paste individual elements.

This is really dirty code. I just got it to work and that was all I was really interested in. It is VBA after all, and only done for one of my own pet Microsoft Access projects. Will probably do some cleanup in the future.

Limitations

This code is pretty basic and has many limitations:

  • The company name is expected to be the first information (after removing white space) in the string and ends with either a comma or a carriage return. Once extracted, the information is removed so that it will not interfere with later operations.
  • The phone number is extracted with a regular expression. It is one of the first things done, and once extracted, the phone information is removed so that it will not interfere with later operations.
  • Likewise, the web address is extracted with a regular expression. It is one of the first things done, and once extracted, the web address information is removed so that it will not interfere with later operations.
  • The street address is assumed to start in the first location that contains a number, and it is assumed to end with either a comma or a carriage return. It is then removed.
  • The City is assumed to come after the street address with a comma separating it from the two letter State with the five or nine number zip code after that.

Including the Reference to Use RegExp

If you are using Microsoft Access, this is how you VBA reference to "Microsoft VBScript Regular Expressions 5.5":

  1. Select "Create" tab on the Ribbon.
  2. Find "Macros and Code" ribbon section and click "Visual Basic".
  3. In "Microsoft Visual Basic for Applications" window, select "Tools" from the top menu.
  4. Select "References...".
  5. Check the box next to "Microsoft VBScript Regular Expressions 5.5" to include in your workbook.
  6. Click "OK".

The Code

The code is as follows:

VB.NET
Option Compare Database
Public Function Parse(str As String) As Collection
    Dim output As New Collection
    str = TrimLeftImproved(str)
   
    phonenumber = getPhoneNumber(str)
    uri = GetURI(str)
   
    'Find Company Name...assume part before ',' or 'cr'
    location = getMaxInstr(str, ",", vbCr)
    output.Add Left(str, location - 1) 'Company
    str = Trim(Mid(str, location + 1))
   
    'Find Street Address...assume part before ',' or 'cr'
    'Assume street address starts with number
    'If there are not two blanks in the string it is not an address
    isNotValidStreet = True
    Do While isNotValidStreet
        location = GetPositionOfFirstNumericCharacter(str)
        str = TrimLeftImproved(Mid(str, location))
        location = getMaxInstr(str, ",", vbCr)
        testStreet = Left(str, location - 1)
        splitstr = Split(testStreet, " ")
        If UBound(splitstr) > 1 Then
            isNotValidStreet = False
        End If
        str = TrimLeftImproved(Mid(str, location + 1))
    Loop
    output.Add Left(testStreet, location - 1) 'Street Address
   
    'Find City...assume all words before
    location = getMaxInstr(str, ",", vbCr)
    output.Add Left(str, location - 1) 'City
    str = TrimLeftImproved(Mid(str, location + 1))
    
    'Find State...assume two letter abbreviation.
    State = Mid(str, 1, 2)
    output.Add UCase(State) 'state
    str = TrimLeftImproved(Mid(str, 3))
   
    'Find Zip Code
    strPattern = "^\d{5}(?:[-\s]\d{4})?$"
    Dim regEx As New RegExp
       With regEx
            .Global = True
            .Multiline = True
            .IgnoreCase = False
            .Pattern = strPattern
        End With
    ZipCode1 = Mid(str, 1, 10)
    ZipCode2 = Mid(str, 1, 5)
    If (regEx.test(ZipCode1)) Then
        ZipCode = ZipCode1
        str = TrimLeftImproved(Mid(str, 10))
    ElseIf (regEx.test(ZipCode2)) Then
        ZipCode = ZipCode2
        str = TrimLeftImproved(Mid(str, 5))
    Else
        ZipCode = ""
    End If
    output.Add GetZip(str) 'ZipCode
   
    output.Add phonenumber
    output.Add uri
    Set Parse = output
End Function

Private Function getMaxInstr(str, val1, val2) As Integer
    location1 = InStr(str, val1)
    location2 = InStr(str, val2)
    If location1 = 0 Then
        location = location2
    ElseIf location2 = 0 Then
        location = location1
    Else
        location = IIf(location1 < location2, location1, location2)
    End If
    getMaxInstr = location
End Function

Public Function GetPositionOfFirstNumericCharacter(ByVal s As String) As Integer
    For i = 1 To Len(s)
        Dim currentCharacter As String
        currentCharacter = Mid(s, i, 1)
        If IsNumeric(currentCharacter) = True Then
            GetPositionOfFirstNumericCharacter = i
            Exit Function
        End If
    Next i
End Function

Function getPhoneNumber(str) As String
    'Use Regular Expression for grabbing the input and automatically filter it
    Dim regEx As New RegExp
    With regEx
        .Global = True
        .Multiline = True
        .IgnoreCase = True
        'This matches the pattern: e.g. <a href="mailto:06+900@07+230">06+900@07+230</a>
        .Pattern = "(\+\d{1,2}\s)?\(?\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}"
    End With
   
    If regEx.test(str) Then
        getPhoneNumber = regEx.Execute(str)(0)
        str = regEx.Replace(str, "")
    Else
        getPhoneNumber = ""
    End If
End Function

Function GetURI(str) As String
    'Use Regular Expressions for grabbing the input and automatically filter it
    Dim regEx As New RegExp
    With regEx
        .Global = True
        .Multiline = True
        .IgnoreCase = True
        .Pattern = "(((ht|f)tp(s?))\://)?(<a href="http://www.|_
                   [a-zA-Z].)[a-zA-Z0-9\-\.]+\.(com|edu|gov|mil|net|org|biz|info|name|museum|_
                   us|ca|uk)(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\;\?\'\\\+&amp;%\$#\=~_\">www.|_
                   [a-zA-Z].)[a-zA-Z0-9\-\.]+\.(com|edu|gov|mil|net|org|biz|info|name|museum|_
                   us|ca|uk)(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\;\?\'\\\+&amp;%\$#\=~_\</a>-]+))*"_
    End With
   
    If regEx.test(str) Then
        GetURI = regEx.Execute(str)(0)
        str = regEx.Replace(str, "")
    Else
        GetURI = ""
    End If
End Function

Function GetZip(str) As String
    'Use Regular Expression for grabbing the input and automatically filter it
    Dim regEx As New RegExp
    With regEx
        .Global = True
        .Multiline = True
        .IgnoreCase = True
        'This matches the pattern: e.g. <a href="mailto:06+900@07+230">06+900@07+230</a>
        .Pattern = "[0-9]{5}(?:-[0-9]{4})?"
    End With
   
    If regEx.test(str) Then
        GetZip = regEx.Execute(str)(0)
        str = regEx.Replace(str, "")
    Else
        GetZip = ""
    End If
End Function

Function TrimLeftImproved(str) As String
    For i = 1 To Len(str)
        Dim currentCharacter As String
        currentCharacter = Mid(str, i, 1)
        If IsNumeric(currentCharacter) Or (Asc(currentCharacter) >= 65 And _
                           Asc(currentCharacter) <= 122) Then
            TrimLeftImproved = Mid(str, i)
            Exit Function
        End If
    Next i
End Function

It will be noted that the address values are returned in a collection. This allows reuse. Also, if the search for the web URI and phone number does not work, then just find a Regular Expression that works better for you.

Using the Code

Here is an example of using this code:

VB.NET
Dim values
Set values = Parse(FullStringEntry)
If Company = "" Then
    Company = values(1)
    Address = values(2)
    City = values(3)
    State = values(4)
    ZipCode = values(5)
    Phone = values(6)
    WebSite = values(7)
End If

The Set for assigning the result of Parse is important, and will not work without the Set in the assignment.

Test Input

Here is some text that was copied on a bing search, and it works:

Extended Stay America

<a href="http://www.extendedstayamerica.com">www.extendedstayamerica.com</a>

6961 Lenoir Ave E, Jacksonville, FL 32216

(904) 296-0181

Here is another example of input the works, copied from Google Maps:

United States Postal Service

4.3

·24 reviews

Post Office

SAVE

NEARBY

Send to your phone

SHARE

 135 E 100 S, Salt Lake City, UT 84111  
Located in: Harmons

 usps.com  
 
 (801) 428-0366

History

  • 2018/05/25: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior) Clifford Nelson Consulting
United States United States
Has been working as a C# developer on contract for the last several years, including 3 years at Microsoft. Previously worked with Visual Basic and Microsoft Access VBA, and have developed code for Word, Excel and Outlook. Started working with WPF in 2007 when part of the Microsoft WPF team. For the last eight years has been working primarily as a senior WPF/C# and Silverlight/C# developer. Currently working as WPF developer with BioNano Genomics in San Diego, CA redesigning their UI for their camera system. he can be reached at qck1@hotmail.com.

Comments and Discussions

 
-- There are no messages in this forum --