TOPIC: How to configure my cluster to run telemac on it?

How to configure my cluster to run telemac on it? 1 year 1 week ago #42554

  • josiastud
Hi everybody,

I built a small cluster to run TELEMAC on (I really need more computing power).
I use 3 nodes for now and plan to add 2 more, but first I need to set things up properly.
I saw a previous thread about this on the forum, but it is 5 to 7 years old. It helped me choose the hardware and, to some extent, the software, but I still cannot get it to work.

What I have and what I have done:

-- 3 desktops, each with an Intel Core i5 but of different generations (1 x 2nd gen and 2 x 3rd gen)
-- Ubuntu 20.04 Server installed on each one
-- I followed the instructions of this video,

instructions that can be summarized as:
1) setting up the local area network (I used a router box in DHCP mode to distribute the addresses, and I connect the desktops in a fixed order so that each one gets the same IP address every time). I also have 2 TP-Link switches, a 10/100 Mbps one and a 10/100/1000 Mbps (gigabit) one, and I made the Ethernet cables myself from Cat6 cable.
2) installing and configuring Secure Shell (SSH)
3) setting up the Network File System (NFS)
4) installing the compilers, MPI and MPICH
5) testing
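As a small illustration of steps 4) and 5), here is a minimal sketch that writes a host file in the `hostname:slots` format read by MPICH's Hydra launcher. The names ben0 and ben1 come from the commands later in this post; ben2 and the count of 4 cores per node are assumptions made only for the example, so adjust them to the real machines.

```python
def mpich_hostfile(cores_per_host):
    """Return MPICH-style hostfile text: one 'hostname:slots' line per node."""
    lines = []
    for host, cores in cores_per_host.items():
        if cores < 1:
            raise ValueError(f"{host}: need at least one core, got {cores}")
        lines.append(f"{host}:{cores}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # ben0 and ben1 appear in the post; ben2 and "4 cores each" are assumed.
    nodes = {"ben0": 4, "ben1": 4, "ben2": 4}
    print(mpich_hostfile(nodes), end="")
```

A file like this is typically passed to the MPI launcher for a quick test of a trivial program on all nodes before trying a full TELEMAC run.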

After that,
- I installed the Ubuntu desktop GUI on the master node (and, I don't know how, a different desktop GUI also got installed, seemingly automatically, on the other nodes).
- I installed TELEMAC v8p4 on the master node following the instructions from hydro-informatics.com, and I shared the installation directory over NFS with the other nodes.

After failing to run simulations across several nodes, I installed Slurm following
the instructions from:
drtailor.medium.com/how-to-setup-slurm-o...eduling-6cc909574365 and
nekodaemon.com/2022/09/02/Slurm-Quick-In...ter-on-Ubuntu-20-04/

My Slurm config file is attached.

And here I am, with simulations that sometimes fail, and simulations on 2 nodes that take 2, 3 or even 5 times as long as the same simulation on a single node.
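To put numbers on that slowdown, here is a minimal sketch of the usual speedup and parallel-efficiency definitions (speedup = single-node time / multi-node time, efficiency = speedup / number of nodes). The run times are normalized placeholders, not measurements from my cluster.

```python
def speedup(t_single, t_parallel):
    """Speedup relative to the single-node run (values > 1 mean faster)."""
    return t_single / t_parallel


def efficiency(t_single, t_parallel, n_nodes):
    """Parallel efficiency: speedup divided by the number of nodes used."""
    return speedup(t_single, t_parallel) / n_nodes


if __name__ == "__main__":
    t_single = 1.0  # normalized single-node run time (placeholder)
    for factor in (2, 3, 5):  # the "2, 3 or 5 times longer" cases on 2 nodes
        t_two_nodes = factor * t_single
        s = speedup(t_single, t_two_nodes)
        e = efficiency(t_single, t_two_nodes, n_nodes=2)
        print(f"{factor}x slower: speedup={s:.2f}, efficiency={e:.2f}")
```

A speedup below 1 means the two-node run is slower than the single-node run, which usually suggests that the time spent on communication between nodes outweighs the gain from the extra cores.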

I cannot run simulations on several nodes when I set the nodes through the options of telemac2d.py/telemac3d.py (telemac2d.py casfiles.txt --ncsize=8 --ncnode=2 --hosts='ben0,ben1'). It only works when I set them outside the TELEMAC script, with 'srun' (srun --nodes=2 --ntasks=8 --nodelist=ben[0,1] telemac2d.py casfile.txt).
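For reference, the two launch styles above can be built side by side with a small sketch like this one, which keeps the task and node counts consistent between them. It only assembles the command lines and uses just the flags I typed above; the helper names `telemac_cmd` and `srun_cmd` are made up for this example.

```python
import shlex


def telemac_cmd(module, cas_file, ncsize, ncnode, hosts):
    """Command line where the nodes are given to the TELEMAC script itself."""
    return [
        f"{module}.py",
        cas_file,
        f"--ncsize={ncsize}",
        f"--ncnode={ncnode}",
        f"--hosts={','.join(hosts)}",
    ]


def srun_cmd(module, cas_file, ntasks, nodes, nodelist):
    """Command line where Slurm's srun chooses the nodes instead."""
    return [
        "srun",
        f"--nodes={nodes}",
        f"--ntasks={ntasks}",
        f"--nodelist={nodelist}",
        f"{module}.py",
        cas_file,
    ]


if __name__ == "__main__":
    # Same values as in the post: 8 tasks on the 2 nodes ben0 and ben1.
    inside = telemac_cmd("telemac2d", "casfile.txt", 8, 2, ["ben0", "ben1"])
    outside = srun_cmd("telemac2d", "casfile.txt", 8, 2, "ben[0,1]")
    print(shlex.join(inside))   # shlex.join quotes arguments for the shell
    print(shlex.join(outside))
```

Each list can be passed to `subprocess.run` as is, which avoids shell quoting problems with the brackets in the node list.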

I am out of ideas now :( and
time is my worst enemy.


Thx for the help,

Josias