Dynamic programming (DP) is a technique for solving problems by breaking them down into overlapping subproblems with optimal substructure. There are many problems that use DP, such as subset sum, knapsack, coin change, etc. DP can also be applied to trees to solve certain specific problems.
Example of basic subtree DP
Given a tree with N nodes and N−1 edges. It is required to find the maximum sum of node values on a path from the root to some leaf without revisiting nodes.
The figure above shows a tree with N=14 nodes and values:
A greedy approach does not work here: choosing locally maximum values (3→10→5) gives path 7 with sum 18, which is not optimal.
Solution idea
Use tree DP.
Let:
DPi=maximum sum of a path from i to any leaf
Then:
for a leaf: DPi=valuei
for an internal node:
DPi=valuei+u∈children(i)maxDPu
The computation is performed by DFS bottom-up.
We start from the leaves and move upward, each time adding the maximum from the subtree to the current node. In the end, DP1 contains the maximum sum of a path from the root to a leaf.
Problem 1
Problem statement
Given a tree T of N nodes.
Each node i contains Ci coins.
You need to choose a subset of nodes such that:
no two adjacent nodes are chosen
the sum of coins is maximal
Solution
This problem is analogous to the array problem:
dp(i)=max(Ai+dp(i−2),dp(i−1))
In a tree, the state is determined by the subtree of a vertex.
Fix a root (for example, 1).
Let dp(V) be the answer for the subtree of vertex V.
The desired answer: dp(1).
Choosing a vertex
If we choose vertex V:
we cannot choose its children v1,…,vk
we can choose grandchildren
If we do not choose V:
we can choose children
DP formulation
Introduce two states:
dp1(V) — maximum if V is included
dp2(V) — maximum if V is not included
Then:
dp1(V)=CV+u∈children(V)∑dp2(u)
dp2(V)=u∈children(V)∑max(dp1(u),dp2(u))
Answer:
max(dp1(1),dp2(1))
Implementation
Problem 2
Problem statement
Given a tree T of N nodes. It is required to compute the length of the longest path between any two nodes (the diameter of the tree).
Solution
Fix the root of the tree at node 1.
Consider an arbitrary vertex x. Two types of maximum path are possible:
The path starts at x and goes down into its subtree. Denote the length as f(x).
The path passes through x and connects two of its subtrees. Denote the length as g(x).
The diameter of the tree equals:
xmaxmax(f(x),g(x))
Computing f(x)
Let vertex V have descendants v1,v2,…,vn.
By definition:
f(V)=1+max(f(v1),f(v2),…,f(vn))
If there are no descendants (a leaf):
f(V)=1
This is a standard Tree-DP recursion: the value of a node depends on the values of its children.
Computing g(x)
The path can:
start in subtree vi
pass through V
end in subtree vj, i=j
To maximize the length, choose the two largest values among:
f(v1),f(v2),…,f(vn)
Then:
g(V)=2+(two maximum values among f(vi))
Result
For each vertex we update:
diameter=max(diameter,f(V),g(V))
Implementation
Problem 3
Problem statement
Given a tree T of N nodes and an integer K.
It is required to find the number of different connected subtrees of size at most K.
Solution
First, let us clarify the definition.
A subtree is any connected subset of vertices of the original tree.
This differs from the standard subtree of a vertex (including all its descendants).
As usual in tree problems, fix the root at node 1.
Denote by S(V) the standard subtree of vertex V
(all vertices in the subtree of V).
Counting all subtrees
First, count the number of all subtrees (without a size limit).
Let:
f(V)=number of subtrees inside S(V), containing V
If V has children v1,v2,…,vn, then for each child there are two options:
take nothing from its subtree
take a subtree rooted at vi
Therefore:
f(V)=i=1∏n(1+f(vi))
However, this counts only subtrees rooted at V.
We also need to account for subtrees lying inside descendants.
Introduce:
g(V)=number of subtrees in S(V), not containing V
Then:
g(V)=u∈children(V)∑(f(u)+g(u))
The total number of subtrees in the tree:
f(1)+g(1)
Size constraint ≤ K
Now it is required to count only subtrees of size at most K.
Introduce a more precise DP.
DP state
f(V,k)=number of subtrees of size k rooted at V
Then a subtree of size k+1 consists of:
vertex V
children subtrees with total size k
Combining children
Let V have children v1,…,vn.
If from subtree vi we take ai vertices, then:
∑ai=k
and the contribution:
∏f(vi,ai)
Summing over all partitions of k, we get f(V,k+1).
Local DP (knapsack over children)
For computation it is convenient to maintain an intermediate DP.
Let:
dp[i][j]=number of ways to choose j vertices from the first i children
Transition:
dp[i][j]=t=0∑jdp[i−1][j−t]⋅f(vi,t)
In the end:
f(V,k)=dp[n][k−1]
(−1 because one vertex is V itself).
Final answer
We need to sum over all vertices and sizes:
V=1∑Nk=1∑Kf(V,k)
Implementation
Now let’s analyze the complexity. At each node with n children we perform a computation of n⋅K2, so the overall complexity is O(N⋅K2).
Optimization
This is a very useful problem in the world of competitive programming. The solution presented in that blog has complexity O(n⋅K2), but we want to change it to O(n⋅K).
Let Subv denote the size of the subtree of vertex v.
The solution is similar to the solution from that blog, and the difference is very simple. You do not need to iterate over all values of K. Obviously, if i>Subv, then dp[v][i]=0. So each time you need to iterate over values min(K,Subv).
The code becomes as follows:
Proof of complexity
First, introduce the array EnterTime. If EnterTimev=x, then v is the x-th vertex that we entered during the traversal. Each vertex has stv, which is an array, and at the very beginning stv={v} for all v. Later we change this set, and we will see how. We also assume that stv is always ordered by EnterTime.
We want to build a directed graph H of size n. When we are about to update dp[v] from its child vertex u, we add a directed edge from each vertex in stu to each vertex in stv in graph H.
Then we update stv by adding the first K−size(stv) vertices from stu (if size(stv)=K, then this is ignored). We also remove each vertex from stu.
The complexity equals the number of edges in H. We want to prove that the number of edges in H does not exceed n⋅(2K) by proving that each vertex in H has at most 2K outgoing edges.
According to our definition, at any moment in time for any vertex v there exists at most 1 vertex u such that v is in stu. Consider the last set w, which is stu, and we are about to update dp[v] from its child vertex u. Suppose that w is in the p-th position in stu. According to our definition, the edges outgoing from w are edges to the first p−1 vertices in stu.
We know that p≤K, so so far the number of edges outgoing from w is at most K. When we update dp[v], we will add at most K to this number. Therefore, at this moment w has at most 2K outgoing edges. Also, we considered stu as the last set in which w is located. Consequently, in the future there will be no other edges outgoing from w.
Thus, we proved that the number of edges in H does not exceed n⋅(2K), therefore the complexity is O(nK).