Question

I wonder whether the objective function of a general dynamic programming problem can always be formulated as on the Wikipedia page for dynamic programming, where the objective function is a sum of terms, one for the action and state at each stage? Or is that just a special case, and if so, what is the general formulation?


EDIT:

By "dynamic programming problem", I mean a problem that can be solved by the dynamic programming technique. Such problems possess the properties of optimal substructure and overlapping subproblems.

But at least for me it is sometimes not easy to identify such problems, perhaps because I have not become used to that kind of verbal description. When I came across the Wikipedia page for the Bellman equation, I did feel that a mathematical formulation of the cost function would help somehow. I suspect the overall cost/gain function can always be represented as an accumulation of the cost/gain from all the stages, and that the accumulation can be additive, multiplicative, or something else?

When I posted my question, I did realize that it would be more appropriate to discuss dynamic programming somewhere more oriented to mathematical optimization. But there is quite a lot of discussion of computer algorithms on stackoverflow.com, so I did not feel it improper to ask my question here either.


Solution

That's not how I would characterize an arbitrary optimization problem (or a dynamic programming algorithm). In particular, the discount factor β^t looks like an electrical engineering hack that programmers wouldn't usually want. More subtly, it won't always be obvious what the stage-cost function F is for a given problem.

But yes, set β to 1 and any arbitrary objective function can be formulated that way. Generally the objective function may be any function of the initial state and all the actions taken; given such a function, it's easy to define a function F to plug into that formula.
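To make the additive formulation concrete, here is a minimal backward-induction sketch of a finite-horizon problem with additive stage costs, V_t(x) = min_a [ F(x, a) + β·V_{t+1}(T(x, a)) ]. All names here (the states, actions, stage cost F, and transition T) are illustrative, not taken from any particular problem:

```python
# Backward induction for a finite-horizon DP with additive stage costs:
#   V_t(x) = min over a of [ F(x, a) + beta * V_{t+1}(T(x, a)) ]
# States, actions, F, and T below are toy placeholders for illustration.

def solve_dp(states, actions, F, T, horizon, beta=1.0):
    V = {x: 0.0 for x in states}          # terminal value V_N = 0
    for _ in range(horizon):              # sweep backwards from t = N-1 to 0
        V = {x: min(F(x, a) + beta * V[T(x, a)] for a in actions)
             for x in states}
    return V

# Toy instance: a walk on the integers 0..3, paying the current state's
# value each step; actions move -1/0/+1, clamped to the state space.
states = range(4)
actions = (-1, 0, 1)
F = lambda x, a: x                        # stage cost depends only on state
T = lambda x, a: min(max(x + a, 0), 3)    # clamped transition
V = solve_dp(states, actions, F, T, horizon=3)
print(V[3])   # optimal 3-step cost starting from state 3
```

With β = 1 this is exactly the "sum of per-stage terms" objective from the question; a different accumulation (e.g. multiplicative) would just replace the `+` inside the recursion.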

Whether that's a useful thing to do or not depends on the problem, I suppose.

OTHER TIPS

In computer science, dynamic programming denotes building an algorithm by recursively splitting a problem into subproblems, when the same subproblems appear many times in this recursive expansion. As a simple textbook example, Fibonacci numbers can be calculated using dynamic programming:

From the recurrence F(n) = F(n-1) + F(n-2) you could implement the following algorithm:

def fibonacci(n):
    if n < 2: return 1
    else: return fibonacci(n - 1) + fibonacci(n - 2)

Now this is of course not efficient at all, because it creates a huge number of recursive calls, e.g.

F(8) = F(7) + F(6) = [F(6) + F(5)] + [F(5) + F(4)] = ... 

So here we already see that fibonacci(5) is computed twice by the implementation. The dynamic programming paradigm is now to "memoize" or "cache" the results, like this:

store = {}

def memofibo(n):
    if n < 2: return 1
    elif n in store: return store[n]
    else:
        f = memofibo(n - 1) + memofibo(n - 2)
        store[n] = f
        return f

This implementation ensures that the recursive step is executed at most once for every argument value of n, so it calculates the nth Fibonacci number in O(n) recursive steps: O(n log n) time with a standard O(log n) balanced-tree map for 'store', or expected O(n) time with a hash table.
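For completeness, the same table-filling idea can also be written bottom-up ("tabulation"), iterating from the base cases instead of memoizing recursive calls. A minimal sketch, assuming the same base case F(0) = F(1) = 1 as above:

```python
# Bottom-up ("tabulation") variant: fill values in order from the base
# cases instead of memoizing recursive calls. O(n) time, O(1) space,
# and no recursion depth to worry about.
def fibo(n):
    a, b = 1, 1                    # F(0) = F(1) = 1, as in the answer
    for _ in range(n - 1):
        a, b = b, a + b            # slide the window one step forward
    return a if n < 2 else b

print([fibo(i) for i in range(8)])   # -> [1, 1, 2, 3, 5, 8, 13, 21]
```

Top-down memoization and bottom-up tabulation are the two standard faces of the same dynamic programming pattern; the bottom-up form trades the convenience of the recursive definition for tighter control of time and space.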

So from the computer science perspective, the link you provided is the operations research / optimization problem version of the same idea (dividing problem into subproblems), but the idea has been abstracted in practice to this recursion + memoization pattern in the domain of general computer science. I hope this helps to clear some of the clouds.

Folks,

There's a new(ish) website that concentrates on operations research questions here but the low volume of traffic there may not get you a good answer very quickly.

Soapbox time:

For those who care to debate what's appropriate for Stack Overflow, let us note that an algorithm is an algorithm regardless of who claims it as part of their field. The simplex method, Dijkstra's algorithm, branch and bound, and Lagrangian relaxation are all algorithms or methods for solving certain types of problems. Many of these are taught and applied in both fields, so the border between OR and CS is pretty blurry in this area.

For instance (and a very strong instance it is) the undergrad course in algorithms at MIT includes all of the following - Randomized Competitive Algorithm, Dynamic Programming, Greedy Algorithms, Minimum Spanning Trees, Shortest Paths, Dijkstra's Algorithm, Bellman-Ford, Linear Programming, Depth-first Search, Topological Sort, and All-pairs Shortest Paths among other topics. I'll defer to MIT in this case.

I like stack overflow because many programmers recognize an optimization problem when they encounter it, but often they just need a little help in deciding how to formulate the problem or even what the problem is called by name.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow