Skip to content

Binary Search Tree

A binary search tree (BST) is a binary tree where for each node, all values in the left subtree are less than the node's value, and all values in the right subtree are greater than the node's value. This implementation provides efficient search, insertion, and deletion operations with average time complexity of O(log n) for balanced trees.

Features

  • Ordered structure - maintains sorted order of elements automatically
  • Efficient operations - O(log n) search, insert, delete for balanced trees
  • Range queries - find elements within a specific range efficiently
  • Min/Max operations - instant access to smallest and largest elements
  • K-th element - find k-th smallest or largest element
  • Floor/Ceiling - find closest elements below/above a given value
  • Validation - verify BST property maintenance
  • Iterator support - traverse elements in sorted order
  • Parent references - efficient successor/predecessor operations

Time Complexities

Operation Average Case Worst Case Description
Search O(log n) O(n) Find element by value
Insert O(log n) O(n) Add new element
Delete O(log n) O(n) Remove element
Find Min/Max O(log n) O(n) Get minimum/maximum
Range Query O(log n + k) O(n) Find elements in range
K-th Element O(log n) O(n) Find k-th smallest/largest
Floor/Ceiling O(log n) O(n) Find closest elements
Inorder Traversal O(n) O(n) Get sorted sequence
Validation O(n) O(n) Verify BST property

Space Complexity: O(n) for storage, O(h) for recursion stack where h is height

Basic Usage

Creating and Initializing

from algo.data_structs.trees.binary_search_tree import BinarySearchTree

# Create empty BST
bst = BinarySearchTree()
print(bst.is_empty())      # True
print(bst.size())          # 0
print(bst.height())        # -1

# Create BST with root value
bst = BinarySearchTree(10)
print(bst.is_empty())      # False
print(bst.size())          # 1
print(bst.root.data)       # 10

# BST is generic - works with any comparable type
int_bst = BinarySearchTree(42)
str_bst = BinarySearchTree("root")

BST Nodes and Parent References

from algo.data_structs.trees.binary_search_tree import BSTNode

# Create BST node
node = BSTNode(5)
print(node.data)           # 5
print(node.parent)         # None
print(node.is_leaf())      # True

# Nodes automatically maintain parent references
bst = BinarySearchTree(10)
bst.insert(5)
bst.insert(15)

# Parent references are set automatically
left_child = bst.root.left
print(left_child.data)     # 5
print(left_child.parent.data)  # 10
print(left_child.is_left_child())  # True

right_child = bst.root.right
print(right_child.is_right_child())  # True

Building Trees

# Insert elements to build BST
bst = BinarySearchTree(10)

# Insert maintains BST property automatically
bst.insert(5)    # Goes to left (5 < 10)
bst.insert(15)   # Goes to right (15 > 10)
bst.insert(3)    # Goes to left of 5 (3 < 5)
bst.insert(7)    # Goes to right of 5 (7 > 5)
bst.insert(12)   # Goes to left of 15 (12 < 15)
bst.insert(20)   # Goes to right of 15 (20 > 15)

print(bst.size())          # 7
print(bst.height())        # 2

# Tree structure automatically maintains BST property:
#       10
#      /  \
#     5    15
#    / \   / \
#   3   7 12  20

# Get sorted sequence with inorder traversal
print(bst.inorder_traversal())  # [3, 5, 7, 10, 12, 15, 20]

BST Property Enforcement

# BST doesn't allow arbitrary insertions like regular binary tree
bst = BinarySearchTree(10)

# These methods are disabled in BST to maintain ordering
try:
    bst.insert_left(10, 5)    # Raises NotImplementedError
except NotImplementedError as e:
    print("BST maintains ordering - use insert() method instead")

try:
    bst.insert_right(10, 15)  # Raises NotImplementedError
except NotImplementedError as e:
    print("BST maintains ordering - use insert() method instead")

# Use insert() method instead - maintains BST property
bst.insert(5)    # Automatically goes to correct position
bst.insert(15)   # Automatically goes to correct position

Search Operations

bst = BinarySearchTree(10)
for value in [5, 15, 3, 7, 12, 20]:
    bst.insert(value)

# BST search is efficient - O(log n) average case
assert bst.search(10) is not None  # Returns the node
assert bst.search(5) is not None
assert bst.search(99) is None

# Multiple ways to check existence
print(bst.contains(7))     # True
print(bst.find(12))        # True (inherited from BinaryTree, uses BST search)
print(20 in bst)           # True (uses __contains__)
print(99 in bst)           # False

# Search returns the actual node
node = bst.search(15)
print(node.data)           # 15
print(node.left.data)      # 12
print(node.right.data)     # 20

Find Minimum and Maximum

# Efficient min/max operations - O(log n)
print(bst.find_min())      # 3 (leftmost node)
print(bst.find_max())      # 20 (rightmost node)

# Works on empty tree
empty_bst = BinarySearchTree()
print(empty_bst.find_min())  # None
print(empty_bst.find_max())  # None

# Single node tree
single_bst = BinarySearchTree(42)
print(single_bst.find_min())  # 42
print(single_bst.find_max())  # 42

Element Insertion

Maintaining BST Property

bst = BinarySearchTree()

# Insert elements in any order - BST maintains sorted structure
elements = [50, 30, 70, 20, 40, 60, 80, 10, 25, 35, 45]
for element in elements:
    success = bst.insert(element)
    print(f"Inserted {element}: {success}")

# Inorder traversal always gives sorted sequence
print(bst.get_sorted_list())  # [10, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80]

# Tree automatically balanced based on insertion order
print(f"Size: {bst.size()}")     # 11
print(f"Height: {bst.height()}")  # Depends on insertion order

Handling Duplicates

bst = BinarySearchTree(10)
bst.insert(5)
bst.insert(15)

# BST doesn't allow duplicates
assert bst.insert(10) == False  # Duplicate not inserted
assert bst.insert(5) == False   # Duplicate not inserted
assert bst.insert(15) == False  # Duplicate not inserted

print(bst.size())  # Still 3 - duplicates not added

# Only unique values are stored
print(bst.get_sorted_list())  # [5, 10, 15]

Insert Return Values

bst = BinarySearchTree()

# insert() returns True for successful insertion
assert bst.insert(10) == True   # New element
assert bst.insert(5) == True    # New element
assert bst.insert(10) == False  # Duplicate

# Use return value to check if insertion was successful
values = [10, 5, 15, 5, 20, 10]
unique_count = 0
for value in values:
    if bst.insert(value):
        unique_count += 1

print(f"Unique values inserted: {unique_count}")  # 4
print(f"BST size: {bst.size()}")                  # 4

Element Deletion

Delete Leaf Nodes

bst = BinarySearchTree(10)
for value in [5, 15, 3, 7, 12, 20]:
    bst.insert(value)

print(f"Initial size: {bst.size()}")  # 7

# Delete leaf nodes (simplest case)
assert bst.delete(3) == True   # Leaf node deleted
assert bst.delete(7) == True   # Leaf node deleted
assert bst.delete(12) == True  # Leaf node deleted
assert bst.delete(20) == True  # Leaf node deleted

print(f"After deleting leaves: {bst.size()}")  # 3
print(bst.get_sorted_list())                   # [5, 10, 15]

# Tree structure maintained
assert bst.is_valid_bst()

Delete Nodes with One Child

# Create tree where some nodes have one child
bst = BinarySearchTree(10)
bst.insert(5)
bst.insert(15)
bst.insert(20)  # 15 has only right child

# Delete node with one child
assert bst.delete(15) == True
print(bst.get_sorted_list())  # [5, 10, 20]

# Child takes parent's place
assert bst.root.right.data == 20
assert bst.root.right.parent == bst.root
assert bst.is_valid_bst()

Delete Nodes with Two Children

bst = BinarySearchTree(10)
for value in [5, 15, 3, 7, 12, 20]:
    bst.insert(value)

# Delete node with two children (most complex case)
assert bst.delete(5) == True   # Node with two children

# BST property maintained using inorder successor
print(bst.get_sorted_list())   # [3, 7, 10, 12, 15, 20]
assert bst.is_valid_bst()

# Delete root (also has two children)
assert bst.delete(10) == True
print(bst.get_sorted_list())   # Still sorted
assert bst.is_valid_bst()

Delete Operations Edge Cases

# Delete from empty tree
empty_bst = BinarySearchTree()
assert empty_bst.delete(10) == False

# Delete non-existent element
bst = BinarySearchTree(10)
bst.insert(5)
bst.insert(15)
assert bst.delete(99) == False  # Doesn't exist
assert bst.size() == 3          # Size unchanged

# Delete root of single-node tree
single_bst = BinarySearchTree(42)
assert single_bst.delete(42) == True
assert single_bst.is_empty()

Traversals and Iteration

Sorted Traversal

bst = BinarySearchTree(10)
for value in [5, 15, 3, 7, 12, 20]:
    bst.insert(value)

# Inorder traversal gives sorted sequence (key BST property)
sorted_values = bst.inorder_traversal()
print(sorted_values)  # [3, 5, 7, 10, 12, 15, 20]

# Equivalent method
sorted_values = bst.get_sorted_list()
print(sorted_values)  # [3, 5, 7, 10, 12, 15, 20]

# BST is iterable - returns sorted order
for value in bst:
    print(value)  # Prints: 3, 5, 7, 10, 12, 15, 20

# Convert to sorted list
sorted_list = list(bst)
print(sorted_list)  # [3, 5, 7, 10, 12, 15, 20]

Other Traversal Methods

# Other traversals inherited from BinaryTree
print("Preorder: ", bst.preorder_traversal())   # [10, 5, 3, 7, 15, 12, 20]
print("Postorder:", bst.postorder_traversal())  # [3, 7, 5, 12, 20, 15, 10]
print("Level-order:", bst.level_order_traversal())  # [10, 5, 15, 3, 7, 12, 20]

# Preorder useful for tree reconstruction
# Postorder useful for tree deletion
# Level-order useful for breadth-first operations

K-th Element Operations

Finding K-th Smallest

bst = BinarySearchTree(10)
for value in [5, 15, 3, 7, 12, 20, 1]:
    bst.insert(value)

# Sorted order: [1, 3, 5, 7, 10, 12, 15, 20]

# Find k-th smallest (1-indexed)
print(bst.find_kth_smallest(1))  # 1 (smallest)
print(bst.find_kth_smallest(2))  # 3 (second smallest)
print(bst.find_kth_smallest(4))  # 7 (fourth smallest)
print(bst.find_kth_smallest(8))  # 20 (largest)

# Out of bounds returns None
print(bst.find_kth_smallest(0))   # None (invalid k)
print(bst.find_kth_smallest(9))   # None (k > size)
print(bst.find_kth_smallest(-1))  # None (negative k)

Finding K-th Largest

# Find k-th largest (1-indexed from the end)
print(bst.find_kth_largest(1))  # 20 (largest)
print(bst.find_kth_largest(2))  # 15 (second largest)
print(bst.find_kth_largest(4))  # 10 (fourth largest)
print(bst.find_kth_largest(8))  # 1 (smallest)

# Out of bounds returns None
print(bst.find_kth_largest(0))   # None
print(bst.find_kth_largest(9))   # None

Practical K-th Element Usage

def find_median(bst):
    """Find median value in BST."""
    size = bst.size()
    if size == 0:
        return None

    if size % 2 == 1:
        # Odd size - middle element
        return bst.find_kth_smallest((size + 1) // 2)
    else:
        # Even size - average of two middle elements
        mid1 = bst.find_kth_smallest(size // 2)
        mid2 = bst.find_kth_smallest(size // 2 + 1)
        return (mid1 + mid2) / 2

# Example usage
numbers_bst = BinarySearchTree()
for num in [10, 5, 15, 3, 7, 12, 20]:
    numbers_bst.insert(num)

median = find_median(numbers_bst)
print(f"Median: {median}")  # 10.0 (middle of [3, 5, 7, 10, 12, 15, 20])

Range Queries

Basic Range Queries

bst = BinarySearchTree(10)
for value in [5, 15, 3, 7, 12, 20, 1, 25]:
    bst.insert(value)

# Find all values in range [min_val, max_val] (inclusive)
result = bst.range_query(5, 15)
print(result)  # [5, 7, 10, 12, 15] - sorted order

# Partial ranges
result = bst.range_query(8, 18)
print(result)  # [10, 12, 15]

# Single value range
result = bst.range_query(10, 10)
print(result)  # [10]

# Empty ranges
result = bst.range_query(8, 9)   # No values in range
print(result)  # []

result = bst.range_query(30, 40)  # Range beyond max
print(result)  # []

Range Query Applications

def count_in_range(bst, min_val, max_val):
    """Count elements in range."""
    return len(bst.range_query(min_val, max_val))

def sum_in_range(bst, min_val, max_val):
    """Sum elements in range."""
    return sum(bst.range_query(min_val, max_val))

# Example: grade analysis
grades_bst = BinarySearchTree()
grades = [85, 92, 78, 96, 88, 73, 91, 82, 87, 94]
for grade in grades:
    grades_bst.insert(grade)

# Count students with grade A (90-100)
a_count = count_in_range(grades_bst, 90, 100)
print(f"Grade A students: {a_count}")  # 4

# Count students with grade B (80-89)
b_count = count_in_range(grades_bst, 80, 89)
print(f"Grade B students: {b_count}")  # 4

# Average of passing grades (70+)
passing_grades = grades_bst.range_query(70, 100)
average = sum(passing_grades) / len(passing_grades)
print(f"Passing grade average: {average:.1f}")

Floor and Ceiling Operations

Floor Operation

bst = BinarySearchTree(10)
for value in [5, 15, 3, 7, 12, 20]:
    bst.insert(value)

# Floor: largest value <= given value
print(bst.floor(10))  # 10 (exact match)
print(bst.floor(8))   # 7 (largest value <= 8)
print(bst.floor(25))  # 20 (largest value <= 25)
print(bst.floor(2))   # None (no value <= 2)

# Floor useful for "at most" queries
def largest_at_most(bst, limit):
    """Find largest value that doesn't exceed limit."""
    return bst.floor(limit)

print(largest_at_most(bst, 13))  # 12
print(largest_at_most(bst, 6))   # 5

Ceiling Operation

# Ceiling: smallest value >= given value
print(bst.ceiling(10))  # 10 (exact match)
print(bst.ceiling(8))   # 10 (smallest value >= 8)
print(bst.ceiling(2))   # 3 (smallest value >= 2)
print(bst.ceiling(25))  # None (no value >= 25)

# Ceiling useful for "at least" queries
def smallest_at_least(bst, minimum):
    """Find smallest value that meets minimum requirement."""
    return bst.ceiling(minimum)

print(smallest_at_least(bst, 6))   # 7
print(smallest_at_least(bst, 13))  # 15

Floor/Ceiling Applications

def find_closest_values(bst, target):
    """Find values closest to target (one below, one above)."""
    floor_val = bst.floor(target)
    ceiling_val = bst.ceiling(target)

    result = {}
    if floor_val is not None:
        result['below'] = floor_val
    if ceiling_val is not None:
        result['above'] = ceiling_val

    return result

# Example: price lookup
prices_bst = BinarySearchTree()
prices = [10.99, 15.50, 8.25, 22.00, 12.75, 18.90]
for price in prices:
    prices_bst.insert(price)

# Find prices around budget
budget = 14.00
closest = find_closest_values(prices_bst, budget)
print(f"Budget: ${budget}")
print(f"Closest below: ${closest.get('below', 'None')}")    # $12.75
print(f"Closest above: ${closest.get('above', 'None')}")    # $15.50

Tree Validation and Properties

BST Property Validation

bst = BinarySearchTree(10)
for value in [5, 15, 3, 7, 12, 20]:
    bst.insert(value)

# Verify BST property is maintained
assert bst.is_valid_bst()  # True

# All operations maintain BST property
bst.delete(5)
assert bst.is_valid_bst()  # Still True

bst.insert(8)
assert bst.is_valid_bst()  # Still True

# Empty and single-node trees are valid
empty_bst = BinarySearchTree()
assert empty_bst.is_valid_bst()  # True

single_bst = BinarySearchTree(42)
assert single_bst.is_valid_bst()  # True

Node Relationships

bst = BinarySearchTree(10)
bst.insert(5)
bst.insert(15)
bst.insert(3)
bst.insert(7)

# Access nodes and their relationships
root = bst.root
left_child = root.left
right_child = root.right

# Parent-child relationships
print(f"Root: {root.data}")                    # 10
print(f"Left child: {left_child.data}")        # 5
print(f"Right child: {right_child.data}")      # 15

# Parent references
print(f"Left child's parent: {left_child.parent.data}")   # 10
print(f"Right child's parent: {right_child.parent.data}") # 10

# Position checks
print(f"Left child is left: {left_child.is_left_child()}")    # True
print(f"Right child is right: {right_child.is_right_child()}")  # True

# Min/max in subtrees
print(f"Min in left subtree: {left_child.get_min().data}")   # 3
print(f"Max in left subtree: {left_child.get_max().data}")   # 7

Successor and Predecessor

# Find inorder successor and predecessor
node_7 = bst.search(7)
node_5 = bst.search(5)
node_3 = bst.search(3)

# Successor (next larger element)
print(f"Successor of 7: {node_7.inorder_successor().data}")  # 10
print(f"Successor of 5: {node_5.inorder_successor().data}")  # 7
print(f"Successor of 3: {node_3.inorder_successor().data}")  # 5

# Predecessor (next smaller element)  
print(f"Predecessor of 7: {node_7.inorder_predecessor().data}")  # 5
print(f"Predecessor of 5: {node_5.inorder_predecessor().data}")  # 3
print(f"Predecessor of 3: {node_3.inorder_predecessor()}")       # None (smallest)

# Useful for range iterations and ordered operations

Working with Different Types

String BST

# BST with string data - uses lexicographic ordering
str_bst = BinarySearchTree("medium")
words = ["apple", "zebra", "banana", "orange", "grape"]

for word in words:
    str_bst.insert(word)

# Sorted alphabetically
print(str_bst.get_sorted_list())  
# ['apple', 'banana', 'grape', 'medium', 'orange', 'zebra']

# String operations
print(f"First word: {str_bst.find_min()}")     # 'apple'
print(f"Last word: {str_bst.find_max()}")      # 'zebra'

# Range query with strings
words_m_to_p = str_bst.range_query("m", "p")
print(words_m_to_p)  # ['medium', 'orange']

# Floor/ceiling with strings
print(str_bst.floor("cherry"))    # 'banana'
print(str_bst.ceiling("cherry"))  # 'grape'

Numeric BST (Floats)

# BST with floating-point numbers
float_bst = BinarySearchTree(5.5)
values = [2.3, 7.8, 1.1, 9.9, 4.4, 6.6]

for value in values:
    float_bst.insert(value)

print(float_bst.get_sorted_list())  
# [1.1, 2.3, 4.4, 5.5, 6.6, 7.8, 9.9]

# Precise range queries
result = float_bst.range_query(2.0, 8.0)
print(result)  # [2.3, 4.4, 5.5, 6.6, 7.8]

# Find closest values to a target
target = 5.0
floor_val = float_bst.floor(target)    # 4.4
ceiling_val = float_bst.ceiling(target)  # 5.5
print(f"Closest to {target}: {floor_val} and {ceiling_val}")

Custom Object BST

class Person:
    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age

    def __lt__(self, other):
        return self.age < other.age

    def __eq__(self, other):
        return isinstance(other, Person) and self.age == other.age

    def __repr__(self):
        return f"Person({self.name}, {self.age})"

# BST ordered by age
people_bst = BinarySearchTree(Person("Alice", 30))
people = [
    Person("Bob", 25),
    Person("Charlie", 35),
    Person("Diana", 28),
    Person("Eve", 32)
]

for person in people:
    people_bst.insert(person)

# Operations work with custom objects
print("People by age:")
for person in people_bst:
    print(f"  {person}")

print(f"Youngest: {people_bst.find_min()}")
print(f"Oldest: {people_bst.find_max()}")

# Find people in age range
twenties = people_bst.range_query(Person("", 20), Person("", 29))
print(f"People in 20s: {twenties}")

Tree Display and Debugging

String Representation

# Empty BST
empty_bst = BinarySearchTree()
print(str(empty_bst))  # "Empty BST"

# Non-empty BST
bst = BinarySearchTree(10)
for value in [5, 15, 3, 7]:
    bst.insert(value)

print(bst)
# Output:
# Binary Search Tree (size: 5, height: 2)
# Root: 10
# Inorder traversal: [3, 5, 7, 10, 15]

# Detailed representation for debugging
print(repr(bst))  # BinarySearchTree(root=10, size=5)

Tree Information Display

def print_bst_info(bst, name="BST"):
    """Print comprehensive BST information."""
    print(f"\n{name} Information:")
    print(f"  Size: {bst.size()}")
    print(f"  Height: {bst.height()}")
    print(f"  Is empty: {bst.is_empty()}")
    print(f"  Is valid BST: {bst.is_valid_bst()}")
    print(f"  Is balanced: {bst.is_balanced()}")

    if not bst.is_empty():
        print(f"  Min value: {bst.find_min()}")
        print(f"  Max value: {bst.find_max()}")
        print(f"  Root: {bst.root.data}")
        print(f"  Sorted values: {bst.get_sorted_list()}")

# Example usage
bst = BinarySearchTree(50)
for value in [30, 70, 20, 40, 60, 80]:
    bst.insert(value)

print_bst_info(bst, "Sample BST")

Utility Functions

Building Balanced BST

from algo.data_structs.trees.binary_search_tree import build_bst_from_sorted_array

# Build balanced BST from sorted array
sorted_values = [1, 3, 5, 7, 9, 11, 13, 15]
balanced_bst = build_bst_from_sorted_array(sorted_values)

print(f"Size: {balanced_bst.size()}")           # 8
print(f"Height: {balanced_bst.height()}")       # 3 (log n)
print(f"Is balanced: {balanced_bst.is_balanced()}")  # True
print(f"Root: {balanced_bst.root.data}")        # 8 (middle element)

# Verify it's a valid BST
assert balanced_bst.is_valid_bst()
assert balanced_bst.get_sorted_list() == sorted_values

Merging BSTs

from algo.data_structs.trees.binary_search_tree import merge_bsts

# Create two BSTs
bst1 = BinarySearchTree(5)
for value in [3, 7, 1, 4]:
    bst1.insert(value)

bst2 = BinarySearchTree(10)
for value in [8, 12, 6, 14]:
    bst2.insert(value)

# Merge them into a balanced BST
merged_bst = merge_bsts(bst1, bst2)

print(f"BST1: {bst1.get_sorted_list()}")      # [1, 3, 4, 5, 7]
print(f"BST2: {bst2.get_sorted_list()}")      # [6, 8, 10, 12, 14]
print(f"Merged: {merged_bst.get_sorted_list()}")  # [1, 3, 4, 5, 6, 7, 8, 10, 12, 14]

assert merged_bst.is_valid_bst()
assert merged_bst.is_balanced()

# Handles duplicates automatically
bst_with_dups = BinarySearchTree(5)
bst_with_dups.insert(3)
bst_with_dups.insert(7)

merged_with_dups = merge_bsts(bst1, bst_with_dups)
print(f"Merged with duplicates: {merged_with_dups.get_sorted_list()}")
# Duplicates are removed: [1, 3, 4, 5, 7]

Common Patterns

BST as Priority Queue

class Task:
    def __init__(self, name: str, priority: int):
        self.name = name
        self.priority = priority

    def __lt__(self, other):
        return self.priority < other.priority  # Lower number = higher priority

    def __eq__(self, other):
        return isinstance(other, Task) and self.priority == other.priority

    def __repr__(self):
        return f"Task({self.name}, p={self.priority})"

# Use BST for task management
task_bst = BinarySearchTree()
tasks = [
    Task("urgent", 1),
    Task("normal", 5),
    Task("low", 10),
    Task("critical", 0),
    Task("medium", 3)
]

for task in tasks:
    task_bst.insert(task)

print("Tasks by priority:")
for task in task_bst:  # Sorted by priority
    print(f"  {task}")

# Get highest priority task
highest_priority = task_bst.find_min()
print(f"Next task: {highest_priority}")

# Get tasks with priority <= 3
urgent_tasks = task_bst.range_query(Task("", 0), Task("", 3))
print(f"Urgent tasks: {urgent_tasks}")

Range-based Operations

def analyze_scores(scores):
    """Analyze test scores using BST."""
    bst = BinarySearchTree()
    for score in scores:
        bst.insert(score)

    total_scores = bst.size()
    min_score = bst.find_min()
    max_score = bst.find_max()

    # Grade distribution
    grade_a = len(bst.range_query(90, 100))
    grade_b = len(bst.range_query(80, 89))
    grade_c = len(bst.range_query(70, 79))
    grade_d = len(bst.range_query(60, 69))
    grade_f = len(bst.range_query(0, 59))

    # Percentiles
    median = bst.find_kth_smallest((total_scores + 1) // 2)
    q1 = bst.find_kth_smallest((total_scores + 1) // 4)
    q3 = bst.find_kth_smallest(3 * (total_scores + 1) // 4)

    return {
        'total': total_scores,
        'min': min_score,
        'max': max_score,
        'median': median,
        'q1': q1,
        'q3': q3,
        'grades': {'A': grade_a, 'B': grade_b, 'C': grade_c, 'D': grade_d, 'F': grade_f}
    }

# Example usage
scores = [85, 92, 78, 96, 88, 73, 91, 82, 87, 94, 76, 89]
analysis = analyze_scores(scores)

print("Score Analysis:")
print(f"  Range: {analysis['min']} - {analysis['max']}")
print(f"  Median: {analysis['median']}")
print(f"  Q1: {analysis['q1']}, Q3: {analysis['q3']}")
print(f"  Grade distribution: {analysis['grades']}")

Autocomplete System

def build_autocomplete_bst(words):
    """Build autocomplete system using BST."""
    bst = BinarySearchTree()
    for word in words:
        bst.insert(word.lower())
    return bst

def autocomplete(bst, prefix):
    """Find all words with given prefix."""
    prefix = prefix.lower()

    # Find start of range (ceiling of prefix)
    start = bst.ceiling(prefix)
    if not start or not start.startswith(prefix):
        return []

    # Find end of range (floor of prefix + 'z')
    end_prefix = prefix[:-1] + chr(ord(prefix[-1]) + 1) if prefix else 'z'
    end = bst.floor(end_prefix + 'z' * 10)  # Add padding

    if not end:
        # Get all words from start to max
        all_words = bst.get_sorted_list()
        start_idx = all_words.index(start)
        candidates = all_words[start_idx:]
    else:
        # Get words in range
        candidates = bst.range_query(start, end)

    # Filter to only words with prefix
    return [word for word in candidates if word.startswith(prefix)]

# Example usage
words = [
    "apple", "application", "apply", "appreciate", "approach",
    "banana", "band", "bank", "bar", "base",
    "cat", "car", "card", "care", "carry"
]

autocomplete_bst = build_autocomplete_bst(words)

# Test autocomplete
print("Autocomplete for 'app':")
suggestions = autocomplete(autocomplete_bst, "app")
for suggestion in suggestions:
    print(f"  {suggestion}")

print("\nAutocomplete for 'ca':")
suggestions = autocomplete(autocomplete_bst, "ca")
for suggestion in suggestions:
    print(f"  {suggestion}")

Database Index Simulation

class Record:
    def __init__(self, id: int, name: str, age: int):
        self.id = id
        self.name = name
        self.age = age

    def __lt__(self, other):
        return self.id < other.id

    def __eq__(self, other):
        return isinstance(other, Record) and self.id == other.id

    def __repr__(self):
        return f"Record(id={self.id}, name='{self.name}', age={self.age})"

class SimpleDatabase:
    def __init__(self):
        self.id_index = BinarySearchTree()  # Primary index by ID
        self.age_index = BinarySearchTree() # Secondary index by age

    def insert_record(self, record):
        """Insert record with indexing."""
        # Insert in primary index
        success = self.id_index.insert(record)
        if success:
            # Create age index entry
            age_record = Record(record.age, record.name, record.id)
            self.age_index.insert(age_record)
        return success

    def find_by_id(self, record_id):
        """Find record by ID."""
        search_record = Record(record_id, "", 0)
        node = self.id_index.search(search_record)
        return node.data if node else None

    def find_by_age_range(self, min_age, max_age):
        """Find all records in age range."""
        min_record = Record(min_age, "", 0)
        max_record = Record(max_age, "", 0)
        age_records = self.age_index.range_query(min_record, max_record)

        # Get full records by ID
        return [self.find_by_id(rec.age) for rec in age_records]

    def get_youngest(self):
        """Get youngest person."""
        min_age_record = self.age_index.find_min()
        if min_age_record:
            return self.find_by_id(min_age_record.age)
        return None

    def get_oldest(self):
        """Get oldest person."""
        max_age_record = self.age_index.find_max()
        if max_age_record:
            return self.find_by_id(max_age_record.age)
        return None

# Example usage
db = SimpleDatabase()

# Insert records
records = [
    Record(1001, "Alice", 25),
    Record(1002, "Bob", 30),
    Record(1003, "Charlie", 22),
    Record(1004, "Diana", 28),
    Record(1005, "Eve", 35)
]

for record in records:
    db.insert_record(record)

# Query operations
print("Find by ID 1003:", db.find_by_id(1003))
print("People aged 25-30:", db.find_by_age_range(25, 30))
print("Youngest person:", db.get_youngest())
print("Oldest person:", db.get_oldest())

Practical Applications

Event Scheduling System

from datetime import datetime, timedelta

class Event:
    def __init__(self, name: str, start_time: datetime, duration: int):
        self.name = name
        self.start_time = start_time
        self.duration = duration  # minutes
        self.end_time = start_time + timedelta(minutes=duration)

    def __lt__(self, other):
        return self.start_time < other.start_time

    def __eq__(self, other):
        return isinstance(other, Event) and self.start_time == other.start_time

    def __repr__(self):
        return f"Event('{self.name}', {self.start_time.strftime('%H:%M')})"

class EventScheduler:
    def __init__(self):
        self.events = BinarySearchTree()

    def add_event(self, event):
        """Add event to schedule."""
        return self.events.insert(event)

    def get_schedule(self):
        """Get events in chronological order."""
        return list(self.events)

    def find_events_in_timeframe(self, start, end):
        """Find events within time range."""
        start_event = Event("", start, 0)
        end_event = Event("", end, 0)
        return self.events.range_query(start_event, end_event)

    def get_next_event(self, current_time):
        """Find next event after current time."""
        search_event = Event("", current_time, 0)
        return self.events.ceiling(search_event)

    def check_conflicts(self, new_event):
        """Check for scheduling conflicts."""
        # Find events that might overlap
        window_start = new_event.start_time - timedelta(minutes=new_event.duration)
        window_end = new_event.end_time

        candidates = self.find_events_in_timeframe(window_start, window_end)

        conflicts = []
        for event in candidates:
            # Check for actual overlap
            if (new_event.start_time < event.end_time and 
                new_event.end_time > event.start_time):
                conflicts.append(event)

        return conflicts

# Example usage
scheduler = EventScheduler()
base_time = datetime(2024, 1, 15, 9, 0)  # 9:00 AM

# Add events
events = [
    Event("Morning Meeting", base_time, 60),
    Event("Code Review", base_time + timedelta(hours=2), 45),
    Event("Lunch", base_time + timedelta(hours=3), 60),
    Event("Design Session", base_time + timedelta(hours=5), 90),
    Event("Team Standup", base_time + timedelta(hours=7), 30)
]

for event in events:
    scheduler.add_event(event)

# Query schedule
print("Today's Schedule:")
for event in scheduler.get_schedule():
    print(f"  {event.start_time.strftime('%H:%M')} - {event.name}")

# Find next event
current = base_time + timedelta(hours=1, minutes=30)  # 10:30 AM
next_event = scheduler.get_next_event(current)
print(f"\nNext event after {current.strftime('%H:%M')}: {next_event}")

# Check for conflicts
new_event = Event("Emergency Meeting", base_time + timedelta(hours=2, minutes=30), 30)
conflicts = scheduler.check_conflicts(new_event)
if conflicts:
    print(f"\nConflict detected for {new_event}:")
    for conflict in conflicts:
        print(f"  Conflicts with: {conflict}")
else:
    print(f"\nNo conflicts for {new_event}")

Price Tracking System

class PricePoint:
    def __init__(self, price: float, timestamp: datetime, volume: int = 0):
        self.price = price
        self.timestamp = timestamp
        self.volume = volume

    def __lt__(self, other):
        return self.price < other.price

    def __eq__(self, other):
        return isinstance(other, PricePoint) and abs(self.price - other.price) < 0.01

    def __repr__(self):
        return f"${self.price:.2f}"

class PriceTracker:
    def __init__(self, symbol: str):
        self.symbol = symbol
        self.prices = BinarySearchTree()

    def add_price(self, price, timestamp, volume=0):
        """Add price point."""
        point = PricePoint(price, timestamp, volume)
        return self.prices.insert(point)

    def get_current_range(self):
        """Get current price range."""
        if self.prices.is_empty():
            return None, None
        return self.prices.find_min().price, self.prices.find_max().price

    def find_support_resistance(self, current_price):
        """Find support (below) and resistance (above) levels."""
        search_point = PricePoint(current_price, datetime.now())

        support = self.prices.floor(search_point)
        resistance = self.prices.ceiling(search_point)

        return (support.price if support else None, 
                resistance.price if resistance else None)

    def get_prices_in_range(self, min_price, max_price):
        """Get all recorded prices in range."""
        min_point = PricePoint(min_price, datetime.now())
        max_point = PricePoint(max_price, datetime.now())

        price_points = self.prices.range_query(min_point, max_point)
        return [point.price for point in price_points]

    def calculate_percentiles(self):
        """Calculate price percentiles."""
        size = self.prices.size()
        if size == 0:
            return {}

        return {
            'min': self.prices.find_min().price,
            'p25': self.prices.find_kth_smallest(size // 4 + 1).price if size >= 4 else None,
            'median': self.prices.find_kth_smallest((size + 1) // 2).price,
            'p75': self.prices.find_kth_largest(size // 4 + 1).price if size >= 4 else None,
            'max': self.prices.find_max().price
        }

# Example usage
tracker = PriceTracker("AAPL")

# Add historical prices
prices = [150.25, 152.80, 148.90, 155.60, 147.30, 153.40, 149.70, 156.80, 151.20]
base_time = datetime.now() - timedelta(days=len(prices))

for i, price in enumerate(prices):
    timestamp = base_time + timedelta(days=i)
    tracker.add_price(price, timestamp, volume=1000 + i * 100)

# Analysis
min_price, max_price = tracker.get_current_range()
print(f"{tracker.symbol} Price Range: ${min_price:.2f} - ${max_price:.2f}")

# Current market analysis
current_price = 152.00
support, resistance = tracker.find_support_resistance(current_price)
print(f"Current Price: ${current_price:.2f}")
print(f"Support Level: ${support:.2f}" if support else "No support found")
print(f"Resistance Level: ${resistance:.2f}" if resistance else "No resistance found")

# Price distribution
percentiles = tracker.calculate_percentiles()
print(f"\nPrice Percentiles:")
for level, price in percentiles.items():
    if price:
        print(f"  {level}: ${price:.2f}")

# Find prices in specific range
range_prices = tracker.get_prices_in_range(150.00, 155.00)
print(f"\nPrices between $150-$155: {[f'${p:.2f}' for p in range_prices]}")

Leaderboard System

class Player:
    def __init__(self, name: str, score: int, level: int = 1):
        self.name = name
        self.score = score
        self.level = level

    def __lt__(self, other):
        # Higher score is "less" for reverse order in leaderboard
        if self.score != other.score:
            return self.score > other.score
        return self.name < other.name  # Tiebreaker by name

    def __eq__(self, other):
        return (isinstance(other, Player) and 
                self.score == other.score and 
                self.name == other.name)

    def __repr__(self):
        return f"Player({self.name}, {self.score})"

class Leaderboard:
    def __init__(self, max_size: int = 100):
        self.players = BinarySearchTree()
        self.max_size = max_size

    def add_player(self, player):
        """Add player to leaderboard."""
        return self.players.insert(player)

    def update_score(self, name: str, new_score: int):
        """Update player's score."""
        # Find and remove old entry
        old_player = None
        for player in self.players:
            if player.name == name:
                old_player = player
                break

        if old_player:
            self.players.delete(old_player)

        # Add updated player
        updated_player = Player(name, new_score, old_player.level if old_player else 1)
        return self.players.insert(updated_player)

    def get_top_players(self, n: int = 10):
        """Get top N players."""
        all_players = list(self.players)  # Already sorted by BST iteration
        return all_players[:min(n, len(all_players))]

    def get_player_rank(self, name: str):
        """Get player's current rank."""
        all_players = list(self.players)
        for i, player in enumerate(all_players):
            if player.name == name:
                return i + 1  # 1-indexed rank
        return None

    def get_players_in_score_range(self, min_score: int, max_score: int):
        """Get players with scores in range."""
        # Note: scores are stored in reverse order (higher first)
        min_player = Player("", max_score)  # Reverse logic
        max_player = Player("", min_score)

        return self.players.range_query(min_player, max_player)

    def get_percentile_score(self, percentile: int):
        """Get score at given percentile (1-100)."""
        size = self.players.size()
        if size == 0:
            return None

        # Convert percentile to rank (higher percentile = better rank)
        rank = max(1, int(size * (100 - percentile) / 100))
        player = self.players.find_kth_smallest(rank)
        return player.score if player else None

# Example usage
leaderboard = Leaderboard()

# Add players
players_data = [
    ("Alice", 15420),
    ("Bob", 12350),
    ("Charlie", 18900),
    ("Diana", 16780),
    ("Eve", 14200),
    ("Frank", 11890),
    ("Grace", 17650),
    ("Henry", 13450),
    ("Iris", 19200),
    ("Jack", 10980)
]

for name, score in players_data:
    player = Player(name, score)
    leaderboard.add_player(player)

# Display top players
print("Top 5 Players:")
top_players = leaderboard.get_top_players(5)
for i, player in enumerate(top_players, 1):
    print(f"  {i}. {player.name}: {player.score:,}")

# Player rank lookup
player_name = "Diana"
rank = leaderboard.get_player_rank(player_name)
print(f"\n{player_name}'s rank: #{rank}")

# Score analysis
p90_score = leaderboard.get_percentile_score(90)
p50_score = leaderboard.get_percentile_score(50)
print(f"\n90th percentile score: {p90_score:,}")
print(f"Median score: {p50_score:,}")

# Update a player's score
leaderboard.update_score("Bob", 20500)
print(f"\nAfter Bob's score update:")
new_rank = leaderboard.get_player_rank("Bob")
print(f"Bob's new rank: #{new_rank}")

# Find competitive players
competitive_players = leaderboard.get_players_in_score_range(15000, 20000)
print(f"\nPlayers with 15K-20K scores:")
for player in competitive_players:
    print(f"  {player.name}: {player.score:,}")

Use Cases

When to Use Binary Search Trees

Good for:

  • Maintaining sorted data with frequent insertions/deletions
  • Range queries and ordered traversals
  • Finding min/max efficiently
  • K-th smallest/largest element queries
  • Floor/ceiling operations
  • Database indexing and searching
  • Autocomplete systems
  • Leaderboards and rankings

Examples:

# Maintaining sorted collection
scores_bst = BinarySearchTree()
for score in [85, 92, 78, 96, 88]:
    scores_bst.insert(score)
print(scores_bst.get_sorted_list())  # [78, 85, 88, 92, 96]

# Range queries
grade_a_scores = scores_bst.range_query(90, 100)  # [92, 96]

# Dictionary/spell checker
dictionary_bst = BinarySearchTree()
for word in ["apple", "banana", "cherry"]:
    dictionary_bst.insert(word)
print("apple" in dictionary_bst)  # True - O(log n) lookup

# Priority systems
task_bst = BinarySearchTree()
task_bst.insert(Task("urgent", 1))
task_bst.insert(Task("normal", 5))
next_task = task_bst.find_min()  # Highest priority (lowest number)

When NOT to Use Binary Search Trees

Avoid for:

  • Frequent random access by index (use arrays/lists)
  • Maintaining insertion order (use lists or queues)
  • Small datasets (overhead not worth it)
  • Data that doesn't have natural ordering
  • When balancing is critical (use AVL/Red-Black trees)

Better alternatives:

# For random access, use list
random_access = [1, 2, 3, 4, 5]
print(random_access[2])  # O(1) direct access

# For insertion order, use list or deque
insertion_order = []
insertion_order.append("first")
insertion_order.append("second")

# For small datasets, simple list may be faster
small_data = [3, 1, 4, 1, 5]
small_data.sort()  # Simple and efficient for small data

# For guaranteed balance, use specialized trees
# (AVL trees, Red-Black trees, or use built-in sorted containers)

Performance Characteristics

Time Complexity Analysis

  • Balanced tree operations: O(log n) for search, insert, delete, min/max
  • Unbalanced tree operations: O(n) worst case (degrades to linked list)
  • Range operations: O(log n + k) where k is number of results
  • Traversal operations: O(n) - must visit all nodes
  • K-th element: O(log n) average, O(n) worst case

Space Complexity

  • Storage: O(n) for n nodes, with parent pointers overhead
  • Recursion stack: O(h) where h is height
  • Worst case height: O(n) for completely unbalanced tree
  • Best case height: O(log n) for perfectly balanced tree

Comparison with Other Tree Types

Tree Type Search Insert Delete Guaranteed Balance Successor/Predecessor
BST O(log n) avg O(log n) avg O(log n) avg No O(log n) avg
AVL Tree O(log n) O(log n) O(log n) Yes O(log n)
Red-Black Tree O(log n) O(log n) O(log n) Yes O(log n)
Binary Tree O(n) O(1) O(n) No O(n)
Heap O(n) O(log n) O(log n) Complete N/A

Performance Optimization Tips

# Build balanced BST from sorted data
def create_balanced_bst(sorted_values):
    """Create balanced BST from sorted array."""
    return build_bst_from_sorted_array(sorted_values)

# Better than inserting in sorted order
sorted_data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# BAD: Creates unbalanced tree (height = n-1)
bad_bst = BinarySearchTree()
for value in sorted_data:
    bad_bst.insert(value)
print(f"Unbalanced height: {bad_bst.height()}")  # 9

# GOOD: Creates balanced tree (height = log n)
good_bst = build_bst_from_sorted_array(sorted_data)
print(f"Balanced height: {good_bst.height()}")    # 3

# For better balance with random insertions, consider shuffling
import random
random_data = sorted_data.copy()
random.shuffle(random_data)

random_bst = BinarySearchTree()
for value in random_data:
    random_bst.insert(value)
print(f"Random insertion height: {random_bst.height()}")  # Usually better than sorted

API Reference

Binary Search Tree Implementation

A binary search tree (BST) is a binary tree where for each node:

  • All values in the left subtree are less than the node's value
  • All values in the right subtree are greater than the node's value
  • Both subtrees are also binary search trees

This implementation inherits from BinaryTree and adds BST-specific functionality with efficient search, insertion, and deletion operations with average time complexity of O(log n) for balanced trees.

BSTNode(data, left=None, right=None)

Bases: BinaryTreeNodeInterface

A node in a binary search tree, extending TreeNodeInterface with parent references.

Initialize a BST node.

Parameters:

Name Type Description Default
data Any

The data to store in the node

required
left Optional[BSTNode]

Left child node (optional)

None
right Optional[BSTNode]

Right child node (optional)

None

__repr__()

Detailed string representation of the node.

__str__()

Return string representation of the node.

get_max()

Find the maximum value node in the subtree rooted at this node.

get_min()

Find the minimum value node in the subtree rooted at this node.

has_both_children()

Check if this node has both left and right children.

has_left_child()

Check if this node has a left child.

has_one_child()

Check if this node has exactly one child.

has_right_child()

Check if this node has a right child.

inorder_predecessor()

Find the inorder predecessor of this node.

inorder_successor()

Find the inorder successor of this node.

is_leaf()

Check if this node is a leaf (has no children).

is_left_child()

Check if this node is a left child of its parent.

is_right_child()

Check if this node is a right child of its parent.

BinarySearchTree(root_data=None)

Bases: BinaryTree

Binary Search Tree implementation inheriting from BinaryTree.

Provides efficient operations for searching, inserting, and deleting values while maintaining the BST property. Overrides parent methods where BST-specific behavior is required.

Initialize the binary search tree.

Parameters:

Name Type Description Default
root_data Any

Optional data for the root node

None

__contains__(data)

Check if a value is in the tree using 'in' operator.

__iter__()

Make the tree iterable (returns values in sorted order).

__repr__()

Detailed representation of the BST.

__str__()

String representation of the BST.

ceiling(value)

Find the smallest value in the tree that is >= value.

Parameters:

Name Type Description Default
value Any

The value to find the ceiling for

required

Returns:

Type Description
Optional[Any]

The ceiling value, or None if no such value exists

contains(data)

Check if a value exists in the BST.

delete(data)

Delete a value from the BST while maintaining BST property.

Parameters:

Name Type Description Default
data Any

The value to delete

required

Returns:

Name Type Description
bool bool

True if deletion was successful, False if value not found

find(data)

Override parent find to use efficient BST search.

find_kth_largest(k)

Find the k-th largest element in the BST (1-indexed).

Parameters:

Name Type Description Default
k int

The position of the element to find (1-indexed)

required

Returns:

Type Description
Optional[Any]

The k-th largest element, or None if k is out of bounds

find_kth_smallest(k)

Find the k-th smallest element in the BST (1-indexed).

Parameters:

Name Type Description Default
k int

The position of the element to find (1-indexed)

required

Returns:

Type Description
Optional[Any]

The k-th smallest element, or None if k is out of bounds

find_max()

Find the maximum value in the tree.

find_min()

Find the minimum value in the tree.

floor(value)

Find the largest value in the tree that is <= value.

Parameters:

Name Type Description Default
value Any

The value to find the floor for

required

Returns:

Type Description
Optional[Any]

The floor value, or None if no such value exists

get_sorted_list()

Get all values in the tree as a sorted list (using inorder traversal).

insert(data)

Insert a value into the BST while maintaining BST property.

Parameters:

Name Type Description Default
data Any

The value to insert

required

Returns:

Name Type Description
bool bool

True if insertion was successful, False if value already exists

insert_left(parent_data, new_data)

BST does not support arbitrary left insertion - use insert() instead.

insert_right(parent_data, new_data)

BST does not support arbitrary right insertion - use insert() instead.

is_valid_bst()

Check if the tree satisfies the BST property.

range_query(min_val, max_val)

Find all values in the given range [min_val, max_val].

Parameters:

Name Type Description Default
min_val Any

Minimum value (inclusive)

required
max_val Any

Maximum value (inclusive)

required

Returns:

Type Description
List[Any]

List of values in the range, sorted in ascending order

search(data)

Search for a value in the BST efficiently using BST property.

Parameters:

Name Type Description Default
data Any

The value to search for

required

Returns:

Name Type Description
BSTNode Optional[BSTNode]

The node containing the value, or None if not found

build_bst_from_sorted_array(arr)

Build a balanced BST from a sorted array.

Parameters:

Name Type Description Default
arr List[Any]

Sorted array of values

required

Returns:

Type Description
BinarySearchTree

A balanced binary search tree

merge_bsts(bst1, bst2)

Merge two BSTs into a single balanced BST.

Parameters:

Name Type Description Default
bst1 BinarySearchTree

First BST

required
bst2 BinarySearchTree

Second BST

required

Returns:

Type Description
BinarySearchTree

A new BST containing all elements from both input BSTs