
Topic 25
Tries

“In 1959, (Edward) Fredkin recommended that BBN (Bolt, Beranek and Newman, now BBN Technologies) purchase the very first PDP-1 to support research projects at BBN. The PDP-1 came with no software whatsoever. Fredkin wrote a PDP-1 assembler called FRAP (Free of Rules Assembly Program);”

Tries were first described by René de la Briandais in “File searching using variable length keys.”

Clicker 1
• How would you pronounce “Trie”?
A. “tree”
B. “tri-ee”
C. “try”
D. “tiara”
E. something else
CS314 Tries 2

Tries aka Prefix Trees
• Pronunciation:
• From “retrieval”
• Name coined by computer scientist Edward Fredkin
• Retrieval, so “tree”
• but that is very confusing, so most people pronounce it “try”

Predictive Text and AutoComplete
• Search engines and texting applications guess what you want after you have typed only a few characters

AutoComplete
• So do other programs such as IDEs

Searching a Dictionary
• How?
• Could search a set for all values that start with the given prefix.
• Naively O(N) (search the whole data structure).
• Could improve on this if the data is sorted: binary search for the prefix, then localize the search to that location.
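The two approaches on this slide can be compared with a small sketch. It assumes the dictionary is a plain `String[]` that is already sorted in natural `String` order; the class name `PrefixSearch` and both method names are made up for illustration and are not from the course code.

```java
import java.util.ArrayList;
import java.util.List;

class PrefixSearch {
    // Naive approach: look at every word, so O(N) string comparisons.
    static List<String> linearScan(String[] words, String prefix) {
        List<String> result = new ArrayList<>();
        for (String w : words) {
            if (w.startsWith(prefix)) {
                result.add(w);
            }
        }
        return result;
    }

    // Assumption: sortedWords is sorted in natural String order.
    // Binary search finds the first word >= prefix; every word that
    // starts with the prefix then sits in one contiguous run from there.
    static List<String> binarySearchScan(String[] sortedWords, String prefix) {
        int lo = 0;
        int hi = sortedWords.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sortedWords[mid].compareTo(prefix) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        List<String> result = new ArrayList<>();
        for (int i = lo; i < sortedWords.length
                && sortedWords[i].startsWith(prefix); i++) {
            result.add(sortedWords[i]);
        }
        return result;
    }
}
```

With K matching words, the second method does O(log N) comparisons to find the start and then O(K) to collect the matches, but it only works if the array is kept sorted.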

Tries
• A general tree
• Root node (or possibly a list of root nodes)
• Nodes can have many children – not a binary tree
• In simplest form each node stores a character and a data structure (list?) to refer to its children
• Stores all the words or phrases in a dictionary.
• How?

René de la Briandais Original Paper

? Picture of a Dinosaur

Fall 2022 - Ryan P. Created with Procreate: https://procreate.art/

Can

Candy

Fox

Clicker 2
• Is “fast” in the dictionary represented by this Trie?
A. No
B. Yes
C. It depends

Clicker 3
• Is “fist” in the dictionary represented by this Trie?
A. No
B. Yes
C. It depends

Tries
• Another example of a Trie
• Each node stores:
  – A char
  – A boolean indicating if the string ending at that node is a word
  – A list of children

Predictive Text and AutoComplete
• As characters are entered we descend the Trie
• and from the current node we can descend to terminators and leaves to see all possible words based on the current prefix
• b, e, e: bee, been, bees
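The descent described above might look like the following sketch. It is not the course implementation: the class name `AutoCompleteTrie`, the method name `complete`, and the use of a `TreeMap` for children (chosen only so results come back in alphabetical order) are all assumptions made for this example. Because the map key holds the character, the node itself does not store a char here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class AutoCompleteTrie {
    private static class Node {
        boolean word; // true if the path from the root to here spells a word
        final Map<Character, Node> children = new TreeMap<>();
    }

    private final Node root = new Node();

    void add(String s) {
        Node cur = root;
        for (char c : s.toCharArray()) {
            cur = cur.children.computeIfAbsent(c, k -> new Node());
        }
        cur.word = true;
    }

    // Step 1: descend one node per typed character.
    // Step 2: from that node, visit everything below it and
    //         report each node that is marked as a word.
    List<String> complete(String prefix) {
        List<String> result = new ArrayList<>();
        Node cur = root;
        for (char c : prefix.toCharArray()) {
            cur = cur.children.get(c);
            if (cur == null) {
                return result; // no word starts with this prefix
            }
        }
        collect(cur, new StringBuilder(prefix), result);
        return result;
    }

    private void collect(Node node, StringBuilder path, List<String> out) {
        if (node.word) {
            out.add(path.toString());
        }
        for (Map.Entry<Character, Node> e : node.children.entrySet()) {
            char c = e.getKey();
            path.append(c);
            collect(e.getValue(), path, out);
            path.deleteCharAt(path.length() - 1); // backtrack
        }
    }
}
```

After adding "bee", "been", "bees", and "bell", calling `complete("bee")` visits only the nodes below the second "e", so the cost depends on the prefix length and the size of that subtree rather than on the size of the whole dictionary.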

Tries
• Stores words and phrases.
  – other values possible, but typically Strings
• The whole word or phrase is not actually stored in a single node.
• Rather, the path in the tree represents the word.

Implementing a Trie

    public class Trie {
        private TNode root;
        private int size; // number of words
        private int numNodes;

        public Trie() {
            root = new TNode();
            numNodes = 1; // the root node
        }

        // TNode (a nested class) and the other methods go here
    }

TNode Class

    // requires: import java.util.LinkedList;
    private static class TNode {
        private boolean word;
        private char ch;
        private LinkedList<TNode> children;
    }

• Basic implementation uses a LinkedList of TNode objects for children
• Other options?
  – ArrayList?
  – Something more exotic?

Basic Operations
• Adding a word to the Trie
• Getting all words with given prefix
• Demo in IDE
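Adding a word could be sketched as below, reusing the field names from the slides (`root`, `size`, `numNodes`, `word`, `ch`, `children`). This is not the code from the in-class demo: the class name `SimpleTrie`, the helper `getChild`, and the `contains` method are illustrative additions, and the children list is created eagerly as a simplifying assumption.

```java
import java.util.LinkedList;

class SimpleTrie {
    private static class TNode {
        boolean word;
        char ch;
        LinkedList<TNode> children = new LinkedList<>();

        // Linear search of the child list; returns null if no child has c.
        TNode getChild(char c) {
            for (TNode child : children) {
                if (child.ch == c) {
                    return child;
                }
            }
            return null;
        }
    }

    private final TNode root = new TNode();
    private int size;         // number of words
    private int numNodes = 1; // counts the root

    // Returns true if s was not already in the trie.
    boolean add(String s) {
        TNode cur = root;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            TNode next = cur.getChild(c);
            if (next == null) {
                next = new TNode();
                next.ch = c;
                cur.children.add(next);
                numNodes++;
            }
            cur = next;
        }
        if (cur.word) {
            return false; // already present; nothing changes
        }
        cur.word = true;
        size++;
        return true;
    }

    // A path existing is not enough: the last node must be marked as a word.
    boolean contains(String s) {
        TNode cur = root;
        for (int i = 0; i < s.length() && cur != null; i++) {
            cur = cur.getChild(s.charAt(i));
        }
        return cur != null && cur.word;
    }

    int size() { return size; }

    int numNodes() { return numNodes; }
}
```

Adding "can", "candy", and "fox" creates eight nodes below the root. "candy" reuses the three nodes already created for "can", and "cand" is a path in the trie but not a word, because its last node is not marked.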

Compressed Tries
• Some words, especially long ones, lead to a chain of nodes with a single child, followed by a single child:
[Figure: uncompressed trie with one character per node]

Compressed Trie
• Reduce number of nodes, by having nodes store Strings
• A chain of single child followed by single child (followed by single child) is compressed to a single node with that String
• Does not have to be a chain that terminates in a leaf node
  – Can be an internal chain of nodes

Original, Uncompressed
[Figure: trie with one character per node]

Compressed Version
[Figure: compressed trie; node labels include b, e, ar, id, ll, ell, u, y, s, to, ck, p]
• fewer nodes compared to uncompressed version
• s–t–o–c–k
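One way to see the saving is to build an ordinary trie and then merge the single-child chains. The sketch below is an assumption-laden illustration rather than the course code: the names `CompressedTrieSketch`, `build`, `compress`, and `countNodes` are made up, and the word list in the usage note (bear, bell, bid, bull, buy, sell, stock, stop) is a guess that appears consistent with the node labels in the figure. A node marked as a word is never merged with its child, so no word is lost.

```java
import java.util.ArrayList;
import java.util.List;

class CompressedTrieSketch {
    static class Node {
        String label = ""; // one char before compression, possibly more after
        boolean word;
        List<Node> children = new ArrayList<>();
    }

    // Builds an ordinary trie: one character per node.
    static Node build(String... words) {
        Node root = new Node();
        for (String w : words) {
            Node cur = root;
            for (char c : w.toCharArray()) {
                Node next = null;
                for (Node child : cur.children) {
                    if (child.label.charAt(0) == c) {
                        next = child;
                    }
                }
                if (next == null) {
                    next = new Node();
                    next.label = String.valueOf(c);
                    cur.children.add(next);
                }
                cur = next;
            }
            cur.word = true;
        }
        return root;
    }

    // Merges each non-word node that has exactly one child with that child.
    // The root is left alone; chains may end at a leaf or at a branch.
    static void compress(Node node) {
        for (Node child : node.children) {
            while (!child.word && child.children.size() == 1) {
                Node only = child.children.get(0);
                child.label += only.label;
                child.word = only.word;
                child.children = only.children;
            }
            compress(child);
        }
    }

    static int countNodes(Node node) {
        int count = 1;
        for (Node child : node.children) {
            count += countNodes(child);
        }
        return count;
    }
}
```

For the assumed word list, `countNodes` reports 22 nodes (including the root) before `compress` and 14 after. The chain s–t–o–c–k becomes s, to, ck: "to" is an internal chain that stops at the branch between "stock" and "stop".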

Data Structures
• Data structures we have studied
  – arrays, array based lists, linked lists, maps, sets, stacks, queues, trees, binary search trees, graphs, hash tables, red-black trees, priority queues, heaps, tries
• Most programming languages have some built-in data structures, native or library
• Must be familiar with performance of data structures
  – best learned by implementing them yourself

Data Structures
• We have not covered every data structure
http://en.wikipedia.org/wiki/List_of_data_structures

Data Structures
• deque, b-trees, quad-trees, binary space partition trees, skip list, sparse list, sparse matrix, union-find data structure, Bloom filters, AVL trees, 2-3-4 trees, and more!
• Must be able to learn and apply new data structures
