Keep the load on dom0 and on every domU as low as possible: disable unneeded services, etc.
Give dom0 at least 256MB-512MB of memory. Fixing it at boot time in grub.conf with a parameter like 'dom0_mem=' is preferred over letting Linux balloon the memory, which has already caused problems.
Try to avoid 'file:' definitions in the domU config; use 'tap:aio:' (the blktap driver) or 'phy:' instead. 'file:' uses loopback mounting, which adds more layers on the way down to the real storage.
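A quick way to find domU configs that still use 'file:' disks is to grep for them. This is only a sketch: find_file_disks is a made-up helper name, and it assumes the usual config style where each disk entry is a quoted string beginning with the backend type.

```shell
# List config lines that still define loopback-backed 'file:' disks.
# Usage: find_file_disks <config-file>...
# Assumes each disk entry is a quoted string starting with "file:".
find_file_disks() {
    grep -n "['\"]file:" "$@"
}
```

For example, find_file_disks /etc/xen/*.cfg (the directory and the .cfg suffix are assumptions and depend on the setup). The function returns a non-zero status when no 'file:' disks are found.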
Experiment with the different Xen CPU schedulers.
When using HyperThreading, try pinning domUs to just one thread per CPU core.
Try to dedicate a CPU core, or better a complete CPU, to dom0. Setting (dom0-cpus 1) in /etc/xen/xend-config.sxp limits dom0 to one CPU; the default (dom0-cpus 0) lets dom0 use all available CPUs.
Dedicate cores to the domUs. It is better for a domU or dom0 to always use the same cores than to have it jump between cores. Prefer giving a domain cores on a single CPU socket/package: its cores are likely to communicate with each other, and that is faster within one socket.
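Pinning can also be done at runtime with xm vcpu-pin. The sketch below pins the VCPUs of a domain to consecutive physical CPUs, one VCPU per CPU. pin_domain and the XM variable are names invented for this example. Which physical CPU numbers share a core or a socket has to be checked separately (for instance with xm info), because consecutive numbers may be sibling threads of the same core.

```shell
#!/bin/sh
# Sketch: pin each VCPU of a domain to its own physical CPU.
# Usage: pin_domain <domain> <first-physical-cpu> <vcpu-count>
# Set XM="echo xm" to preview the commands without running them.
XM=${XM:-xm}

pin_domain() {
    dom=$1
    pcpu=$2
    count=$3
    vcpu=0
    while [ "$vcpu" -lt "$count" ]; do
        # xm vcpu-pin <domain> <vcpu> <physical cpus>
        $XM vcpu-pin "$dom" "$vcpu" "$pcpu"
        vcpu=$((vcpu + 1))
        pcpu=$((pcpu + 1))
    done
}
```

pin_domain web 2 2 would pin VCPU 0 of the (hypothetical) domain 'web' to CPU 2 and VCPU 1 to CPU 3; xm vcpu-list web shows the resulting placement.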
USB: a speedup is possible via PVUSB (the paravirtualized USB driver).
Power management: Xen can nowadays put CPUs into C-states to save power. This is implemented in the Xen hypervisor, works best on the newest CPUs (AMD newer than K10), and is not as good as the Linux implementation. One of the reasons to look into using KVM instead ;)
I/O scheduler tuning. I/O schedulers reorder requests into the best order for writing to disk. They should be disabled in the domUs (by setting the scheduler to 'noop') and be active only in dom0: the dom0 scheduler has the best overview of what all domUs need to read and write, and can take all of it into account when optimizing. Optimizations of this kind inside the domUs should be switched off. The scheduler is configured through /sys/block/<disk>/queue/scheduler .
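Reading the scheduler file lists the available I/O schedulers with the active one in square brackets; writing a name to it selects that scheduler. A minimal sketch, with active_scheduler and set_scheduler as invented helper names and xvda as an assumed domU disk name:

```shell
# Print the active I/O scheduler (the name shown in square brackets)
# from a sysfs scheduler file, e.g. /sys/block/xvda/queue/scheduler.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$1"
}

# Select a scheduler by writing its name to the same file (needs root).
# Usage: set_scheduler <scheduler-file> <name>
set_scheduler() {
    echo "$2" > "$1"
}
```

Inside a domU this would be used as set_scheduler /sys/block/xvda/queue/scheduler noop. The setting does not survive a reboot, so it has to be reapplied at boot, for example from an init script.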
HVM domU generic
Do you really need HVM, or can you run your domU paravirtualized? That is much faster!
Try to use paravirtualized drivers. For Linux they can be compiled; there are even paravirtualized drivers for kernel 2.4.