Field Calculator: I just want the third word

Usually once a year I have a need to calc a field to a word position in another field in ArcMap. For example, here is a field from some census data I downloaded:

fieldcalc1

Why did they add the text “Census Tract” to all of these? Seems to be a space waster right? All I really want is just the tract number. So what I will do is create a new field, then calc the value to the 3rd word, or tract number, from the other field.

First I create a new field named appropriately TRACTNUM. You simply do this using the attribute table window pulldown > Add field…

fieldcalc2

Next, I enter the field name, set the type to Text, and set the length to 10.

fieldcalc3

My tract numbers will not be longer than 10 characters. Press the OK button and the field will be added to the end of the attribute table.

Next, I scroll over to the new field name in the attribute table and right click on it to select the field calculator.

fieldcalc4

The field calculator appears. I am using ArcMap 10.1, so I select the Python parser, then double click on the NAMELSAD10 field that contains the tract numbers I want to extract, then set the type to String and click on the .split() function. Next in the expression area between () I enter a double quote, a space, and another double quote, then add [2] at the very end. The expression should look like the following:

fieldcalc5

What this will do is take the text from the NAMELSAD10 field, pass it to the split function which uses the space ” ” as the delimiter to separate the words and then extracts the third word for the TRACTNUM field calculation. In Python, position 0 would be the first word, position 1 the second word, and position 2 the third word, which we want. That is why I used [2].

With everything set in the field calculator, click the OK button. Once the calculation is complete, you will see just the tract numbers in the TRACTNUM field.

fieldcalc6

Now that I documented this, I will not forget how to do this next year! -mike

83 thoughts on “Field Calculator: I just want the third word

    • All very helpful but is there a way to extract a string up to a certain character, in this case a hyhpen(-) and just return everything prior to the hyphen? For example in the following string:
      “Mud Lake-Frame Relay” return back “Mud Lake”

      Thanks!

      • Michael,
        Many thanks for your response! I thought I had tried that but to no avail. I’ll keep this for future reference.
        Jim

  1. Pingback: Using Python with the Field Calculator in ArcGIS | GISC

  2. AHHH!!! Thank you! This helped me parse out an address field to its component parts. I’ll make sure I keep a link to this page and save the script. Yours was the only resource, online… in the WORLD, I could find on this subject that was exactly what I needed. Thanks again,
    Brian

  3. A helpful post indeed 🙂

    However, I hit a slag.
    I’m dealing with parcels which are labelled like 000/0, 000 being the number of a parcel and 0 being the subnumber of a parcel if it has one. Because of the backslash, the field type is text. Basically, I can’t do any calculation with such fields. I can’t even sort the attributes ascending or descending because text needs a leading zero(s) to sort the numbers correctly. Therefore, I want to separate the parcel numbers and subnumbers. I separated the numbers with no problem using the split function you wrote about. However, it does not split the subnumbers. I guess that’s because most parcels don’t have subnumbers and are just labelled 000 without /0, so when I tell the function to use “/” as the delimiter to separate the numbers, Python gets confused since most parcels don’t have “/”, let alone a subnumber. Anyway, I was wondering if there’s a solution to this problem.
    Thank you in advance!

    • Hi Marko. One thought is that you select the records that have the “/” in them, then run the field calculator on them to get at the subnumbers in the selected set of records. If parcels with subnumbers always have 3 numbers before the slash, you can select by attributes using “FIELD_NAME” LIKE ‘___/%’ , where FIELD_NAME is your field that has the parcel number in it, three underscores represent any three characters, then the slash, then a percent which is just a wildcard for any number of characters after the slash. Give that a try.

      • Thank you!

        Indeed selecting the records that have a backslash is the most logical thing. The field calculator then makes no issues calculating only the selected features 🙂

  4. Very helpful post! I’ve actually referred to it a couple of times to refresh my memory on how to do this. However, I am currently trying to parse a string that does not have a separator. Is there a similar function that will only return the first couple characters in a string?
    For example I have a field that has pipe diameter and pipe material in one string: “10AC” or “12DI”. I want to separate out the diameter into its own field. Any suggestions?

    • Hi Madeline. Great question. In your examples the pipe diameter are the first two characters, so your expression would contain
      !FIELD_NAME![:2], where [:2] extracts out the first two characters in FIELD_NAME. For example, if a record had FIELD_NAME = “10AC”, your new field would be calced to “10”. -mike

  5. I am attempting to grab text from a string field but what I am attempting to grab is within Parenthesis..

    For example, 123 South Metro A Line (0123)

    I am attempting to only grab the numbers within the parenthesis. Any ideas?

  6. Hi Mike –

    Do you know if this approach be used with labels? I am trying to return just the house number to label parcels with this:

    !PRC_PRIMARY_SITEADDR!.split(” “)[0]

    The code worked when testing in Field Calculator but not when used as an expression with labeling. I am looking for a more permanent solution and to avoid bringing all the data local in order to add the house number field.

    Thanks for any thoughts!

    klf

    • Hi Kayla. With an expression for labels, you will need to use brackets [ ] for your field names (no consistency right? Thanks ESRI!). You also might have some addresses that are blank which would cause a problem with your one line expression. You will need to click on the Advanced checkbox for the label expression and create a python function to pull out the address number. It will test to see if the address is not blank (length > 0), and if so, it will pull out the address number and set the variable to it, if it is blank, it will set the variable to blank. Here is the code to use:

      def FindLabel ( [PRC_PRIMARY_SITEADDR] ):
        addr = str([PRC_PRIMARY_SITEADDR])
        if len(addr.split()) > 0:
          addnum = addr.split(" ")[0]
        else:
          addnum = ""
        return addnum

      Make sure to indent the code like above, it is important to python. I hope that helps!

      mike

      • After thinking about this a little bit more, I thought I could do this with a one line expression instead of a function, but I forgot since you have to use brackets for field names, when you try to use [0] to pull out the first “word” or address number from the field, you get an error because ArcGIS thinks [0] is a field name! So you have to use a python function. However, you can do this with one line in the function and not worry about blank addresses. Using split(” “) will take care of any blanks and NULLs and just give them a space, which will label just fine. So you can use this expression instead:

        def FindLabel ( [PRC_PRIMARY_SITEADDR] ):
          return [PRC_PRIMARY_SITEADDR].split(" ")[0]

        For some reason the [0] is not interpreted as a field name within the function … go figure!

        mike

  7. Mike, that is absolutely awesome – works like a charm! Thank you also for explaining the “niggly” bits, and that the inconsistencies are not my imagination. You be the coding wizard! (o:

    klf

  8. I am trying to remove all the hyphens and periods from a parcel number (01-01-01-01-01-001.111) in a concatenation expression. I just want 0101010101001111 brought over. Is there away I can do that in an expression in the field calculator? Thank you in advance.
    Bart

    • Hi Bart. This should do it:
      !FIELD_NAME!.replace(“-“, “”).replace(“.”, “”)
      The value in field FIELD_NAME is passed to the first replace which will replace any “-” with a blank, and then is passed on to the second replace that will replace any “.” with a blank.
      -mike

      • Thank you Mike, but I am receiving an ERROR 000539 SyntaxERROR: invalid syntax (<expression, line 1) Below is my expression. I am in ArcDesktop 10.5, trying to concatenate two fields into one. Basically I am trying to create paths that link to where we have Adobe parcel information reports. I have Python selected and I am using a String for my type since this field is a text field. I apologize, I am no expression specialist by any means.

        "\\prodat.local\folders\applications\gisvector\t&m\pdf's\Surveys\12\" + !FMTPRCLNO!.replace(“-“, “”).replace(“.”, “”) + ".pdf"

        and I have also tried

        "\\prodat.local\folders\applications\gisvector\t&m\pdf's\Surveys\12\" & !FMTPRCLNO!.replace(“-“, “”).replace(“.”, “”) & ".pdf"

        Thanks in advance,
        Bart

      • Ah, it’s your backslashes. Python tries to interpret them, so you get an error. For every backslash that you want to keep, you have to add another to tell Python not to interpret it. So you need to double every backslash up:

        “\\\\prodat.local\\folders\\applications\\gisvector\\t&m\\pdf’s\\Surveys\\12\\” + !FMTPRCLNO!.replace(“-“, “”).replace(“.”, “”) + “.pdf”

        Try that. -mike

  9. HELLO MIKE, i am trying to remove some the zeros in the last figures of this expression 100/23/34/0034 . i want the expression to appear as 100/23/34/34. Which expression should i use, THX!

    • Hi Brenda. Do you have just one field with values like “100/23/34/0034” or is there a field that has the last number like “0034”? If it is all in one field, then I would use:

      “/”.join([s.lstrip(“0”) for s in !FIELD_NAME!.split(“/”)])

      mike

  10. Hi Mike,

    I am trying to extract just “NP.117” from a field that contains the following text:
    “Dalje Nga Depo RR. Bekim Berisha Abeja NP.117 UR.Depos 79 Mustafe Haxhijaj Montimi”
    How can I achieve that in field calculator with python?

    Thanks in advance

      • Mike thanks for your reply. There is always value in the field that starts with “NP.” but the numbers change (eg. NP.745, or NP.2850). So I would like to extract these values from the existing field and calculate it to another field.

      • Hi Mesud. The code below will work in the field calculator:

        !FIELD1![!FIELD1!.find("NP."):].split()[0]

        FIELD1 is the field that holds the text that you want to extract the “NP.” data. Also make sure that the field calculator parser is set to Python.

        If you want to know what this code is doing, here is an example by breaking it down in python:

        >>> string = "Dalje Nga Depo RR. Bekim Berisha Abeja NP.117 UR.Depos 79 Mustafe Haxhijaj Montimi"
        >>> string.find("NP.")
        39
        >>> string[39:]
        'NP.117 UR.Depos 79 Mustafe Haxhijaj Montimi'
        >>> string[39:].split()
        ['NP.117', 'UR.Depos', '79', 'Mustafe', 'Haxhijaj', 'Montimi']
        >>> string[39:].split()[0]
        'NP.117'
        >>>

  11. Mike thanks for your reply. The code that you provided worked well, but only after I selected the records that have “NP.” in the field. I noticed that there are records with “np.” occurrence! Is there a way to extract both without having to make a selection in the table?

  12. I have a set of addresses such as:
    206 Air Harbor Rd
    5462 Clearwater Ln
    504 600 N Elm St
    2369 3000 Friendly Av

    Any idea on how to make a selection of all records with the ranges (the last two)?

    • Hi Jennifer. This is a hard one, but you are basically looking for addresses that start with two numbers separated by a space. You can use the Select By Attributes tool and do this as the selection statement:

      (FIELD1 LIKE ‘1%’ OR FIELD1 LIKE ‘2%’ OR FIELD1 LIKE ‘3%’ OR FIELD1 LIKE ‘4%’ OR FIELD1 LIKE ‘5%’ OR FIELD1 LIKE ‘6%’ OR FIELD1 LIKE ‘7%’ OR FIELD1 LIKE ‘8%’ OR FIELD1 LIKE ‘9%’) AND (FIELD1 LIKE ‘% 1%’ OR FIELD1 LIKE ‘% 2%’ OR FIELD1 LIKE ‘% 3%’ OR FIELD1 LIKE ‘% 4%’ OR FIELD1 LIKE ‘% 5%’ OR FIELD1 LIKE ‘% 6%’ OR FIELD1 LIKE ‘% 7%’ OR FIELD1 LIKE ‘% 8%’ OR FIELD1 LIKE ‘% 9%’)

      Note FIELD1 is my field that contains addresses. You would need to modify it with your field name. This basically selects all records that start with a number character from 1 to 9, then any other character after that … AND also the character after the first space must be a number character from 1 to 9, then any other character after that. That statement should select all addresses that have the first two “words” as numbers. -mike

  13. Hi Mike.

    I’m not a expert in this area and I was wondering if you could help me out. I’m trying to calculate a new field in where if the first two letters of another field are “FF” my new field will be named as “Wall Mapping”

    This is what I’m trying to do:

    type = !Name![0:2]
    value1 = “FF”
    if type = value1:
    return 1
    end if

    Not working for me at all.

    Hope you can help.

    • Hi Patricia. Just wondering if selecting all values that start with “FF” in the first field, then calcing the value “Wall Mapping” to the second field based on the selected records would be much easier instead of building a python definition in the field calculator? To select records with a value starting with FF, I would just use this expression in the select by attributes tool in ArcMap: YOURFIELDNAME LIKE ‘FF%’ . Next bring up the attribute table, view the selected records, right click on the second field and bring up the field calculator, then just enter your Wall Mapping value to calc all the selected records.

      • Hi Mike.

        thank you very much for your response.

        These values will change, which is why I would like to keep it as a python definition. I also thought about creating another field with only this expression: !Name![0:2] and then continue with:

        if type = “FF”:
        return 1
        else

        end if

        But I want to do it with just one field.

        Is it possible?

      • Hi Patricia. Anything is possible with Python! So right click on your new field to bring up the field calculator, select Python as your parser, and check the Show Codeblock box. Your code should look something like this in the Pre-Logic Script Code box:

        def TwoLetters(value):
          if value[:2] == "FF":
            return 1
          else:
            return 0
        

        Then below that is another box, you enter your call to the TwoLetters function:

        TwoLetters(!Name!)

        Where your field named Name contains the FF characters. When you click OK on the field calculator, the new field will be given a 1 if the field Name has an FF, and a 0 if not. If you don’t want to return 0, then just remove the else and return 0 statements.

        mike

  14. HI MIKE,
    I have 2 question, i was used phyton in arcgis
    firstly,
    i want extract only the alphabet on road number,

    example : 7/2A and the result is “A”
    but for those that not contain alphabet will become “0”

    secondly:

    how to automatic create the running number, as example :

    Special Road 7/1 will become 001
    Special Road 7/2 will become 002
    Special Road 7/3 will become 003

    until last
    Special Road 7/24 become 024

    but if the road :
    horror road 7/1 it will become 025 (continue after last number)

    thanks

    • Hi Hulwan. I was hoping someone would reply, but since no one did, I will give it a try. I can answer part of your questions.

      For the first one, right click on the field you are going to calculate the alpha character to, and bring up the field calculator. Select Python as your parser, and check the Show Codeblock box. Your code should look something like the following in the Pre-Logic Script Code box:

      def alpha(last):
        if last.isalpha():
          return last
        else:
          return "0"

      Then below in the other box, you enter your call to the “alpha” function:

      alpha(!ROADNUM![-1:])

      My field ROADNUM (yours will be different) contains the road numbers (like “7/2A” and “7/3”). When you execute this, the last character in the ROADNUM field is looked at to see if it is an alpha character. If so, the field you are going to calculate will get that alpha character (like “A”). If not, it will get a zero.

      Your next question is that you want running numbers that are zero filled (3 characters I assume) and they are numbers after the “/” in your road numbers. This can be a little tricky because you could also have an alpha character at the end (like “7/2A”), but we can work around that. Again, use the python code block this way:

      def runnum(num):
        if num.isdigit():
          return num.zfill(3)
        else:
          return num[:-1].zfill(3)

      Then below in the other box:

      runnum(!ROADNUM!.split("/")[1])

      This will split your road number into two pieces, one before the “/” and the other after, and take the one after to test if it is a digit (number). If so, it will zero fill the number so it is 3 characters. If not, it will remove the last character (since it is a letter like “A”) giving just the number and zero filled. Do keep in mind you need to make sure the field you are calcing here must be a text field.

      As for the last part of your second question, I am a little confused. I believe you are asking if the road number is repeated, you want to give it a running number that is incremented after that last one. I really don’t know how to do this since it would be random where two or more records have “7/1”. I don’t know how often that happens, but it might be a manual process for you in the end.

      I hope this helps you out. -mike

  15. Hi Mike
    I did not find a solution in the previous answers, so I’ll try it by asking:
    I have a string field with various tree species abbreviations (depends on which are present in the forest area) followed by the value (can be a ratio from 1 to 10 and e in case of single trees) in parenthesis. Different species are separated by a slash. So each line is different, I cannot simply select the second word or something like that because it differs.
    What I need is to extract only the value of TA.
    FI (7) / TA (3) I need the value 3 in my new field
    BU (5) / FI (5) / VBE (e) / UST (e) / TA (e) I need the value e in my new field
    Selecting all the fields that contain TA is already done and I’m using ArcGIS 10.2.2.

    I hope you can help.
    Regards from Europe

    • Hi Alam. Since you have selected all records that contain “TA”, right click on the field you want to calc the values to and open the Field Calculator. Select the Python parser, check the Show Codeblock checkbox, then for the Pre-Logic Script Code enter the following:

      def NewVal(ts):
        pos = ts.split().index("TA")
        val = ts.split()[pos + 1]
        return val[1:][:len(val) - 2]

      Then in the second input box, where “your_field =”, enter the following:

      NewVal(!TSFIELD!)

      Where TSFIELD is the name of the field that stores all those tree specie values.

      Once set just click the OK button and it will pull out the values in the parenthesis and place them in your field you are calcing to. If you need to do it for other values, like “VBE” or “UST”, make sure to change that on the line with the pos variable where you see “TA”. It uses that value to grab the following value in parenthesis.

      To see what is going on here, I tried an example using Python:

      >>> ts = "BU (5) / FI (5) / VBE (e) / UST (e) / TA (e)"
      >>> print ts
      BU (5) / FI (5) / VBE (e) / UST (e) / TA (e)
      >>> ts.split()
      ['BU', '(5)', '/', 'FI', '(5)', '/', 'VBE', '(e)', '/', 'UST', '(e)', '/', 'TA', '(e)']
      >>> pos = ts.split().index("TA")
      >>> print pos
      12
      >>> val = ts.split()[pos + 1]
      >>> print val
      (e)
      >>> val[1:][:len(val) - 2]
      'e'

      Note the pos variable is set to 12. In Python lists, the first value in the list is position 0, so “TA” is position 12. To get the value after the “TA”, we just extract the value in position 13 (pos + 1). The last line just removes the first character and last character, removing the parenthesis and leaving the value you want.

      Have fun! -mike

      • Dear Mike

        Thank you so much for your answer! This works fantastically (I really had fun!).
        What I first forgot was to mark “string” in the field calculator, then afterwards it was no problem.
        In fact at the beginning I wasn’t sure if your solution would also extract numbers with two positions (e.g. 10, the value for 100%) or only extract one position, but it did.
        This helped me a lot, with your explanation also to understand the way such things are working!
        All the best

        alam

      • Hi Alam. I’m glad it worked for you. The last line with the length function len() takes into account the length of the value in the val variable, so if you have something like “(1)”, that is a length of 3, or “(10)” has a length of 4. The [1:] says only show me characters from position 1 on, which does not include the “(” because that is position 0. So if you had a “(1)” you would get “1)” and if you had a “(10)” you would get “10)”. Passing that next through [:len(val) – 2] says only show me characters up to BUT NOT INCLUDING position equal to the length of the value minus 2. So if you had a “(1)”, that is 3 – 2 = 1, so up to but not including position 1 in “1)” is just “1”, the value you want. If you had a “(10)”, that is 4 – 2 = 2, so up to but not including position 2 in “10)” is “10”. Cool eh? -mike

  16. Hi, been great reading so far and I love learning something new every day! I’ve a question to further the split tool.

    GREENFIELD, 2 DRUMMOND PLACE, CALLANDER, FK17 8ER
    19, DRUMMOND PLACE, CALLANDER, FK17 8ER

    The above addresses, as examples, I’m looking to strip out the detail as your examples given, into separate fields, however, I wish to do the following.

    I want “new field 1” to contain house name (text), “new field 2” to contain house number, “new field 3” street name.

    Calculated in “new field 1” If the first word is a number omit that and return a blank, if the first word is text then return that word. eg; for 19, for GREENFIELD.
    Calculated in “new field 2”, the first number word but not the text or the first instance of a number.
    Calculated in “new field 3” the street name.
    The same is then applied to the rest of the address data.

    Any ideas?

    Many thanks – a person that likes maps from Scotland.

    • saintuk,

      I’m a little confused on your request. If I am to understand your first field should not be a number but text from the first word in the address (the part before the first comma). Then the second field should be a number extracted from the section between the first comma and second comma? And the third field should be the street name, which is the part between the first and second comma but does not include a number if there. Right? This is confusing to me.

  17. I am using ArcDesktop 10.5.1 and I am trying to truncate a field. Basically I am trying to extract a portion of many parcel numbers.
    For example the parcel number is 1609240000001000 and i am trying to extract 092400 I have Python selected and I am using a String for my type since this field is a text field. I apologize, I am no expression specialist by any means. Any help would be very much appreciated.

    Bart

    • Hi Bart. The python expression to use in the field calculator is !FIELD![2:8] where FIELD is the field that contains the long parcel number. The [2:8] extracts character position 2 up to but not including 8. Remember that the first character position is 0 not 1, so for the first three characters 160 the 1 is at position 0 and the 0 is at position 2.

  18. Hello.

    I have something like this:

    1018785824_d6e5cb3b65_w.jpg

    I am trying to grab only this part: d6e5cb3b65

    Is there any idea?

  19. Hi Michael, can you help me please?
    i have this kind of values on a string field

    181800
    and i need replace with

    18:18:00

    thanks a lot

  20. Hi Mike,
    I am sure you can help me.

    I need to calculate new field for First Name and Last Name using a field tha is populated by “last name, first name”. Is there a way to select only the string before the comma and after the comma? my challange is that sometimes the first and last names have muluple words. Ex: FELIPE DE JESUS, LARA-DIAZ.

    I need this to be
    First Name: LARA-DIAZ
    Last Name: FELIPE DE JESUS

    Thanks!!

    • Hi Jessica. Yes you can. Just open the field calculator for the new field, select Python, then in the input area, enter the following:

      “{} {}”.format(!FIELD!.split(“,”)[1], !FIELD!.split(“,”)[0])

      where !FIELD! is the field that contains the last,first name data. The catch is that every record must have a comma in it. That should do it! -mike

  21. Mike,
    I’ve read through all your great answers to these questions and looked at other forums, but I still can’t figure this out:

    I just need to separate Zip and Zip+4 (text) that are in the same column, but some do not have a Zip+4:

    field name: SitusZIP
    90280
    91105
    91105-3357
    ..

    I created a new field for the Zip+4 and calculated the field using !SitusZIP!.split(“-“)[-1] which worked fine, but it will return the first portion of the Zip even if there is no “-“. Any help would be appreciated!

    Using: ArcPro 2.7.0 Python 3

    • Hi Riley. Your python code is correct. The quickest solution is just select all the zip codes that have a “-” in them first, then use the field calculator with your python code. Then you don’t have to deal with those zip codes that do not have the +4 portion in them! -mike

      • OR … a more elegant way if you want to select all records, use this in your field calculator:

        !SitusZIP!.split(“-“)[-1] if “-” in !SitusZIP! else “”

        If the field has a “-” it will execute your split, if not it will just calc a blank. -mike

  22. Hi Mike,

    Hoping you can help; found this via a Google search. Your blog is a great resource!

    I have fields with beginning address ranges with the values, for example:

    706
    12400
    15708

    I want to revise all to be “rounded down” to the nearest 100. So the values would be:
    700
    12400
    15700

    1. Do I first need an expression to select those that are not already rounded to the 100s place?

    2. Then do I need an expression to do the rounding? Or can these be combined in one fell swoop?

    There could also be some values that would need to be selected and rounded to the “tens” place.

    The file is numeric, “long.”

    Any help is appreciated!

      • Thanks, Mike. I also found this expression: int(round(!your_field!, -2)) and it seemed to work. Any thoughts?

      • Mike – Yes, that makes sense now that you say it, though since these are a beginning address ranges, most of them should be on the lower end and round down. However, there were a few outliers. I used the math.floor expression and it worked well. Thank you!

Leave a comment