Yes, the cache can be beneficial. For example, suppose you have 64K stripes and an application requests linear data from the array 64K at a time. A dumb controller would fetch 64K at a time from the drives, i.e. achieve roughly single-drive performance. A smart controller with a cache could read ahead significantly, say reading 64K * 4 = 256K at a time for a 4-drive RAID 0, and hold that for the application, still feeding it 64K at a time but quicker, because the next 64K is already in cache when the request arrives.
(Reality is a bit more complicated than the above, because drives implement their own read-ahead to some degree.)
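The read-ahead idea above can be sketched as a toy model (this is illustrative Python, not controller firmware; the class and sizes are assumptions for the sketch):

```python
STRIPE = 64 * 1024           # 64 KiB stripe size
DRIVES = 4                   # 4-drive RAID 0
READAHEAD = STRIPE * DRIVES  # smart controller fetches 256 KiB per burst

class CachingController:
    def __init__(self):
        self.cache_start = None  # offset of the cached read-ahead window
        self.drive_reads = 0     # bursts that actually touched the drives

    def read(self, offset, length=STRIPE):
        in_cache = (
            self.cache_start is not None
            and self.cache_start <= offset
            and offset + length <= self.cache_start + READAHEAD
        )
        if not in_cache:
            # Cache miss: read a full window, striped across all 4 drives
            self.cache_start = (offset // READAHEAD) * READAHEAD
            self.drive_reads += 1
        return length  # subsequent 64K requests are served from cache

ctrl = CachingController()
# Application reads 1 MiB linearly, 64 KiB at a time (16 requests)
for off in range(0, 1024 * 1024, STRIPE):
    ctrl.read(off)
print(ctrl.drive_reads)  # → 4 parallel bursts instead of 16 stripe reads
```

The point of the model: 16 sequential application requests cost only 4 trips to the drives, and each trip keeps all four spindles busy at once.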
Note, however, that you could use a different stripe size -- e.g. 64K / (number of data drives) = 16K for a 4-drive RAID 0. In that case even a dumb controller would read from all drives concurrently, giving close to optimal performance. A caching controller can still do a bit more in other cases, and with its additional cost and cache capacity its implementation is likely to be more sophisticated overall.
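The stripe arithmetic can be checked with a few lines (a sketch assuming a simple round-robin RAID 0 layout; the function name is mine):

```python
STRIPE = 16 * 1024   # 64 KiB / 4 data drives = 16 KiB per stripe
DRIVES = 4

def drives_touched(offset, length):
    """Which drives a single request hits in a round-robin RAID 0 layout."""
    first = offset // STRIPE
    last = (offset + length - 1) // STRIPE
    return {chunk % DRIVES for chunk in range(first, last + 1)}

# A single 64 KiB request spans four 16 KiB stripes, one per drive:
print(sorted(drives_touched(0, 64 * 1024)))  # → [0, 1, 2, 3]
```

With 64K stripes the same 64K request would land on a single drive, which is exactly why the dumb controller in the first example only sees single-drive speed.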
So yes, there can be a significant benefit in some cases. Whether that benefit is worth the significant cost for your applications and usage patterns is another matter.
Note also that a PCI controller can introduce its own bottleneck -- the standard 32-bit / 33 MHz PCI bus tops out at about 133 MB/s theoretical, so you'll never get 4-drive performance from a controller sitting on it.
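The bus math is quick to verify (the per-drive sustained rate below is an assumed figure for illustration; the 133 MB/s PCI ceiling is the standard 32-bit / 33 MHz theoretical maximum):

```python
pci_bw = 133.0   # MB/s, theoretical max of 32-bit / 33 MHz PCI
drive_bw = 60.0  # MB/s, assumed sustained rate of one drive
drives = 4

aggregate = drives * drive_bw              # 240 MB/s the drives could supply
delivered = min(aggregate, pci_bw)         # the bus caps what reaches the host
print(delivered)  # → 133.0
```

Even with optimistic per-drive numbers, the bus ceiling, not the array, sets the limit, which is why PCI-X or PCIe controllers matter for larger arrays.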