[Pkg-opennebula-devel] Bug#657981: opennebula: Cannot scp back stopped VM's image, but gets deleted anyway

Olivier Berger olivier.berger at it-sudparis.eu
Mon Jan 30 16:40:26 UTC 2012


Hi.

On Mon, Jan 30, 2012 at 03:44:58PM +0100, Olivier Berger wrote:
> 
> For whatever strange reason, upon VM stop (onevm stop), the scp back to the one "master" of the stopped VM's image fails. But the image is deleted anyway from the node, discarding all changes made to the VM :-(
> 
> Here's a log :
> Mon Jan 30 15:35:23 2012 [LCM][I]: New VM state is SAVE_STOP
> Mon Jan 30 15:35:25 2012 [LCM][I]: New VM state is EPILOG_STOP
> Mon Jan 30 15:35:39 2012 [TM][I]: Command execution fail: /usr/lib/one/tm_commands/ssh/tm_mv.sh esther:/var/lib/one//10/images esther:/var/lib/one/10
> Mon Jan 30 15:35:39 2012 [TM][I]: STDERR follows.
> Mon Jan 30 15:35:39 2012 [TM][I]: ERROR MESSAGE --8<------
> Mon Jan 30 15:35:39 2012 [TM][I]: scp: /var/lib/one/10/images/disk.1: Permission denied
> Mon Jan 30 15:35:39 2012 [TM][I]: ERROR MESSAGE ------>8--
> Mon Jan 30 15:35:39 2012 [TM][I]: ExitCode: 1
> Mon Jan 30 15:35:39 2012 [TM][I]: tm_mv.sh: Moving /var/lib/one//10/images
> Mon Jan 30 15:35:39 2012 [TM][I]: tm_mv.sh: Executed "/usr/bin/ssh esther mkdir -p /var/lib/one".
> Mon Jan 30 15:35:39 2012 [TM][I]: tm_mv.sh: ERROR: Command "/usr/bin/scp -r esther:/var/lib/one//10/images esther:/var/lib/one/10" failed.
> Mon Jan 30 15:35:39 2012 [TM][I]: tm_mv.sh: ERROR: scp: /var/lib/one/10/images/disk.1: Permission denied
> Mon Jan 30 15:35:39 2012 [TM][E]: Error excuting image transfer script: scp: /var/lib/one/10/images/disk.1: Permission denied
> Mon Jan 30 15:35:39 2012 [DiM][I]: New VM state is FAILED
> Mon Jan 30 15:35:39 2012 [TM][W]: Ignored: LOG - 10 tm_delete.sh: Deleting /var/lib/one//10/images
> 
> Mon Jan 30 15:35:39 2012 [TM][W]: Ignored: LOG - 10 tm_delete.sh: Executed "/usr/bin/ssh esther rm -rf /var/lib/one//10/images".
> 
> Mon Jan 30 15:35:39 2012 [TM][W]: Ignored: TRANSFER SUCCESS 10 -
> 
> In the above, the master and node are the same, btw.
> 

Actually, this happens mainly because I used the same machine as both the master/front-end and a node, while following the directives for tm_ssh rather than tm_nfs.

So I'm not sure that the Debian-provided defaults and the instructions in README.Debian are the best choices.

If one uses the same machine for the frontend and a node, and does something like:
$ onehost add samehost im_kvm vmm_kvm tm_ssh

Then this bug will happen. The tm_ssh scripts could be made more resilient by testing, in the MV action, whether source and destination are the same, but I think that's more of an upstream problem.
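
As a rough sketch of what such a test could look like (this is not the upstream tm_mv.sh code; the function names normalize_path and is_self_move are hypothetical, and the comparison is purely textual, so "esther" and "localhost" would not be recognized as the same host):

```shell
#!/bin/sh
# Hypothetical helpers, not part of the OpenNebula scripts.

# Collapse repeated slashes and drop a trailing slash (textual only).
normalize_path() {
    printf '%s\n' "$1" | sed -e 's|//*|/|g' -e 's|\(.\)/$|\1|'
}

# Succeed when copying SRC ("host:path") into DST ("host:dir") with
# "scp -r" would land on the source itself.
is_self_move() {
    src_host=${1%%:*}; src_path=$(normalize_path "${1#*:}")
    dst_host=${2%%:*}; dst_path=$(normalize_path "${2#*:}")
    [ "$src_host" = "$dst_host" ] || return 1
    [ "$(dirname "$src_path")" = "$dst_path" ] || [ "$src_path" = "$dst_path" ]
}

# Arguments taken from the failing tm_mv.sh call in the log above.
if is_self_move "esther:/var/lib/one//10/images" "esther:/var/lib/one/10"; then
    echo "source and destination are the same, skipping transfer"
fi
```

With a guard like this, the MV action could return early instead of attempting an scp onto itself, so that a failed copy is never followed by a delete of the only existing image.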

Another option, which seems much safer, is to prefer:
$ onehost add samehost im_kvm vmm_kvm tm_nfs

In that case no real transfer is made, and the VM image is not removed upon stop, provided that the tm_nfs TM_MAD is enabled in /etc/one/oned.conf.
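
A quick way to check whether the driver looks enabled before adding the host could be something like the following. This is only a sketch: it assumes that drivers are declared in oned.conf with a name = "..." attribute on its own line and that disabled ones are commented out with "#"; the function name driver_enabled is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed when FILE ($2) contains an uncommented
# line of the form  name = "DRIVER"  for DRIVER ($1).
# Assumption: this matches how the driver blocks in oned.conf are written.
driver_enabled() {
    grep -Eq "^[[:space:]]*name[[:space:]]*=[[:space:]]*\"$1\"" "$2"
}

conf=/etc/one/oned.conf
if [ -r "$conf" ] && driver_enabled tm_nfs "$conf"; then
    echo "tm_nfs appears to be enabled in $conf"
else
    echo "tm_nfs does not appear to be enabled in $conf"
fi
```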


So I think it would be safer for README.Debian to state that the frontend and a node can be the same machine, but only with the tm_nfs choice (and its activation in oned.conf).

This would help prevent novice users from losing their VM images because they tested both opennebula and opennebula-node on a single physical machine.

I hope this sounds reasonable to the maintainers.

Best regards,
-- 
Olivier BERGER 
http://www-public.it-sudparis.eu/~berger_o/ - OpenPGP-Id: 2048R/5819D7E8
Ingenieur Recherche - Dept INF
Institut TELECOM, SudParis (http://www.it-sudparis.eu/), Evry (France)





