Implementing Splunk 7 (Third Edition)

Indexed fields versus extracted fields

When an event is written to an index, the raw text of the event is captured along with a set of indexed fields. The default indexed fields include host, sourcetype, source, and _time. There are distinct advantages and a few serious disadvantages to using indexed fields.

First, let's look at the advantages of an indexed field (we will discuss how to configure indexed fields in Chapter 11, Configuring Splunk):

  • Because an indexed field is stored in the index with the event itself, it is calculated only once, at index time, rather than being recalculated by each search.
  • It can make finding specific instances of common terms efficient. See the Indexed field case 1 - rare instances of a common term section as an example.
  • You can create new words to search against: words that simply don't exist in the raw text, or that are embedded inside another word. See the sections from Indexed field case 2 - splitting words through Indexed field case 4 - slow requests.
  • You can efficiently search for words in other indexed fields. See the Indexed field case 3 - application from source section.
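
To make the idea of "creating new words" concrete, here is a minimal Python sketch of a toy word index. It is not how Splunk is implemented: the `tokenize` and `build_index` functions, the sample events, and the `sku_prefix` field are all hypothetical, and the tokenizer is a deliberately crude stand-in for real segmentation. The only point is that a value embedded inside a longer word is not searchable as a term by itself unless something writes it into the index as its own entry.

```python
import re
from collections import defaultdict


def tokenize(raw):
    # Crude, assumed segmentation: split on anything that is not a
    # letter, digit, or underscore, and lowercase the pieces.
    return {t.lower() for t in re.split(r"[^A-Za-z0-9_]+", raw) if t}


def build_index(events, indexed_field_rules=None):
    """Map each term to the set of event positions that contain it."""
    index = defaultdict(set)
    for pos, raw in enumerate(events):
        for term in tokenize(raw):
            index[term].add(pos)
        # Hypothetical index-time rules: each adds a "name::value" term.
        for name, pattern in (indexed_field_rules or {}).items():
            match = re.search(pattern, raw)
            if match:
                index[f"{name}::{match.group(1).lower()}"].add(pos)
    return index


events = [
    "2012-01-01 user=bob action=login status=ok",
    "2012-01-01 user=amy action=purchase sku=ABC123-45",
    "2012-01-01 user=bob action=purchase sku=XYZ999-01",
]

plain = build_index(events)
with_field = build_index(events, {"sku_prefix": r"sku=([A-Z]+)\d"})

print("abc" in plain)                 # the prefix is buried inside "abc123"
print(with_field["sku_prefix::abc"])  # the rule created a searchable term
```

In this toy model, `abc` is not a term in the plain index because it only occurs inside `abc123`, while the index built with the rule can find the event directly through the `sku_prefix::abc` entry.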

Now for the disadvantages of an indexed field:

  • It is not retroactive. This is different from extracted fields where all events, past and present, will gain the newly-defined field if the pattern matches. This is the biggest disadvantage of indexed fields and has a few implications, as follows:
    • Only newly-indexed events will gain a newly-defined indexed field. If the pattern is wrong in certain cases, there is no practical way to apply the field to already-indexed events.
    • Likewise, if the log format changes, the indexed field may not be generated (or may be generated incorrectly).
  • It adds to the size of your index on disk.
  • It counts against your license.
  • Any change requires a restart to take effect, which temporarily disrupts data flow.
  • In most cases, the value of the field is already an indexed word, in which case creating an indexed field will likely have no benefit, except in the rare cases where that value is very common.
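
The retroactivity difference is easier to see in a small model. The following Python sketch is purely illustrative and assumes nothing about Splunk internals: the `ToyIndex` class, its method names, and the `took` field are hypothetical. It stores index-time fields alongside each event when the event is written, and applies search-time extraction to the raw text of every stored event.

```python
import re


class ToyIndex:
    """Toy model: index-time fields are fixed when an event is written."""

    def __init__(self):
        self.events = []            # list of (raw_text, indexed_fields)
        self.index_time_rules = {}  # field name -> compiled regex

    def add_index_time_rule(self, name, pattern):
        self.index_time_rules[name] = re.compile(pattern)

    def write(self, raw):
        # Index-time fields are computed once, using the rules in force now.
        fields = {}
        for name, rx in self.index_time_rules.items():
            match = rx.search(raw)
            if match:
                fields[name] = match.group(1)
        self.events.append((raw, fields))

    def search_indexed(self, name):
        # Only returns what was stored when each event was written.
        return [fields.get(name) for _, fields in self.events]

    def search_extracted(self, pattern):
        # Search-time extraction runs against the raw text of every event.
        rx = re.compile(pattern)
        results = []
        for raw, _ in self.events:
            match = rx.search(raw)
            results.append(match.group(1) if match else None)
        return results


idx = ToyIndex()
idx.write("GET /cart took=120ms")                  # written before the rule exists
idx.add_index_time_rule("took", r"took=(\d+)ms")
idx.write("GET /home took=35ms")                   # written after the rule exists

indexed = idx.search_indexed("took")
extracted = idx.search_extracted(r"took=(\d+)ms")
print(indexed)
print(extracted)
```

Here `indexed` is `[None, '35']`, because the first event was written before the rule was defined, while `extracted` is `['120', '35']`, because the search-time pattern is applied to old and new events alike.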

With the disadvantages out of the way, let's look at a few cases where an indexed field would improve search performance, and then at one case where it would probably make no difference.