Problem Statement
We are given a tree with vertices and need to process queries.
Each query gives a set of vertices
We are given a tree with vertices and need to process queries.
Each query gives a set of vertices
For this set, we need to compute the sum of distances over all pairs of chosen vertices:
where is the distance, that is, the number of edges on the path between and .
If all vertices of the tree are included in the query, then the problem becomes much simpler.
We can compute the contribution of each edge independently and add all contributions to the answer.
This leads to an solution.
This idea also works for weighted trees.
In the general case, we do not want to process the entire tree for every query.
Instead, we build a virtual tree (also called an auxiliary tree) for the queried vertices and reduce the problem to a much smaller tree.
Given a set of vertices
the virtual tree is the minimal set of vertices
such that is closed under the LCA operation.
That means:
So the virtual tree consists only of:
This makes it much smaller than the original tree.
As an example, consider the following tree with vertices, and suppose we want to build the virtual tree for the set

To build the virtual tree for a set :
The stack construction works because the vertices are processed in DFS order.
The total complexity of building the virtual tree is:
assuming LCA queries are already available.
void VirtualTree(vector<int> v) {
// sort by DFS entry time
sort(v.begin(), v.end(), [&](int x, int y) {
return tin[x] < tin[y];
});
vector<int> s = v;
for (int i = 1; i < (int)v.size(); i++) {
s.push_back(lca(v[i], v[i - 1]));
}
sort(s.begin(), s.end(), [&](int x, int y) {
return tin[x] < tin[y];
});
s.erase(unique(s.begin(), s.end()), s.end()); // remove duplicates
// s[0] is the root of the virtual tree
vector<int> st = {s[0]};
for (int i = 1; i < (int)s.size(); i++) {
while (!st.empty() && !is_parent(st.back(), s[i])) {
st.pop_back();
}
add_edge(st.back(), s[i]);
st.push_back(s[i]);
}
}