GlusterFS in an Enterprise setting

About a year ago I began my experiment with GlusterFS as a zero-downtime (0DT) SAN. There were a few hiccups at the beginning, and it took a couple of version upgrades to reach the point where I consider it stable, but now I don’t know how anyone lives without it.

My test configuration started with 3 servers with 4 drives each. On each drive I created a single LVM volume spanning half the drive, leaving the other half free for snapshots. The volumes are mounted under /cluster as 0, 1, 2, and 3 and shared using glusterfsd.
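
For reference, this is roughly how each brick is prepared on one server; the device name, volume group name, and ext3 filesystem are my own illustrative choices rather than details from the setup above:

pvcreate /dev/sdb
vgcreate vg_cluster0 /dev/sdb
lvcreate -l 50%VG -n brick vg_cluster0   # half the drive, rest kept for snapshots
mkfs.ext3 /dev/vg_cluster0/brick
mkdir -p /cluster/0
mount /dev/vg_cluster0/brick /cluster/0  # repeat for the remaining three drives

Each server then exports its four bricks with the following glusterfsd volfile: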

volume posix0
  type storage/posix # POSIX FS translator
  option directory /cluster/0 # Export this directory
end-volume

volume locks0
  type features/locks # Implement posix locks, not working 100%
  subvolumes posix0
end-volume

volume brick0
  type performance/io-threads # Performance enhancement
  option thread-count 8
  subvolumes locks0
end-volume

volume posix1
  type storage/posix
  option directory /cluster/1
end-volume

volume locks1
  type features/locks
  subvolumes posix1
end-volume

volume brick1
  type performance/io-threads
  option thread-count 8
  subvolumes locks1
end-volume

volume posix2
  type storage/posix
  option directory /cluster/2
end-volume

volume locks2
  type features/locks
  subvolumes posix2
end-volume

volume brick2
  type performance/io-threads
  option thread-count 8
  subvolumes locks2
end-volume

volume posix3
  type storage/posix
  option directory /cluster/3
end-volume

volume locks3
  type features/locks
  subvolumes posix3
end-volume

volume brick3
  type performance/io-threads
  option thread-count 8
  subvolumes locks3
end-volume

volume server
  type protocol/server
  option transport-type tcp
  subvolumes brick0 brick1 brick2 brick3
  option auth.addr.brick0.allow *
  option auth.addr.brick1.allow *
  option auth.addr.brick2.allow *
  option auth.addr.brick3.allow *
end-volume
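
On each server this volfile is handed straight to glusterfsd. A minimal start command, assuming the file is saved as /etc/glusterfs/glusterfsd.vol (the path is my choice, not part of the setup above):

glusterfsd -f /etc/glusterfs/glusterfsd.vol

The same file is used unchanged on all three servers, since each one exports the same four brick directories.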

Each client then connects to all three servers and handles replication (AFR) and distribution itself, effectively building a RAID 1(3)+0(4): four distributed sets, each mirrored across three servers.

volume fs1_cluster0
  type protocol/client
  option transport-type tcp
  option remote-host fs1.ewcs.local
  option remote-subvolume brick0
end-volume

volume fs1_cluster1
  type protocol/client
  option transport-type tcp
  option remote-host fs1.ewcs.local
  option remote-subvolume brick1
end-volume

volume fs1_cluster2
  type protocol/client
  option transport-type tcp
  option remote-host fs1.ewcs.local
  option remote-subvolume brick2
end-volume

volume fs1_cluster3
  type protocol/client
  option transport-type tcp
  option remote-host fs1.ewcs.local
  option remote-subvolume brick3
end-volume

volume fs2_cluster0
  type protocol/client
  option transport-type tcp
  option remote-host fs2.ewcs.local
  option remote-subvolume brick0
end-volume

volume fs2_cluster1
  type protocol/client
  option transport-type tcp
  option remote-host fs2.ewcs.local
  option remote-subvolume brick1
end-volume

volume fs2_cluster2
  type protocol/client
  option transport-type tcp
  option remote-host fs2.ewcs.local
  option remote-subvolume brick2
end-volume

volume fs2_cluster3
  type protocol/client
  option transport-type tcp
  option remote-host fs2.ewcs.local
  option remote-subvolume brick3
end-volume

volume fs3_cluster0
  type protocol/client
  option transport-type tcp
  option remote-host fs3.ewcs.local
  option remote-subvolume brick0
end-volume

volume fs3_cluster1
  type protocol/client
  option transport-type tcp
  option remote-host fs3.ewcs.local
  option remote-subvolume brick1
end-volume

volume fs3_cluster2
  type protocol/client
  option transport-type tcp
  option remote-host fs3.ewcs.local
  option remote-subvolume brick2
end-volume

volume fs3_cluster3
  type protocol/client
  option transport-type tcp
  option remote-host fs3.ewcs.local
  option remote-subvolume brick3
end-volume

volume repl0
  type cluster/replicate
  subvolumes fs1_cluster0 fs2_cluster0 fs3_cluster0
end-volume

volume repl1
  type cluster/replicate
  subvolumes fs1_cluster1 fs2_cluster1 fs3_cluster1
end-volume

volume repl2
  type cluster/replicate
  subvolumes fs1_cluster2 fs2_cluster2 fs3_cluster2
end-volume

volume repl3
  type cluster/replicate
  subvolumes fs1_cluster3 fs2_cluster3 fs3_cluster3
end-volume

volume distribute
  type cluster/distribute
  subvolumes repl0 repl1 repl2 repl3
end-volume

volume writebehind
  type performance/write-behind
  option aggregate-size 32MB
  option cache-size 64MB
  subvolumes distribute
end-volume

volume ioc
  type performance/io-cache
  option cache-size 64MB
  subvolumes writebehind
end-volume
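
To bring the volume up, this client volfile is passed to the glusterfs FUSE client. A minimal sketch, where the volfile path and mount point are illustrative names of my own:

mkdir -p /mnt/cluster
glusterfs -f /etc/glusterfs/client.vol /mnt/cluster

An fstab entry that uses the volfile path as the device and glusterfs as the filesystem type can make the mount persistent, though the exact form depends on the GlusterFS version in use.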

This configuration tolerates the failure of any two drives within a replica set, or even any two complete servers, without loss of operation. Because each brick uses only half of its drive and every file is stored three times, only 1/6th of the raw disk capacity is actually usable, but given the dramatically lower cost of commodity storage compared to other redundant systems, this is a negligible trade-off.
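
To make the capacity trade-off concrete, here is the arithmetic, assuming 1 TB drives purely for illustration:

12 drives x 1 TB                    = 12 TB raw capacity
half of each drive used for bricks  =  6 TB of brick space
3-way replication                   =  6 TB / 3 = 2 TB usable (1/6 of raw)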

Here’s a graphic representation of the configuration:

[Figure: 3-server, 12-disk, triple-redundant distributed GlusterFS design]

~ by edwyseguru on June 11, 2010.

3 Responses to “GlusterFS in an Enterprise setting”

  1. Hi. I like your setup. I’m curious to know how you would scale this setup to 10 or 15 servers.

  2. […] [4] GlusterFS-Design: https://edwyseguru.wordpress.com/2010/06/11/glusterfs-design/ […]

  3. […] intergalactic. One thing led to another, and we came across this earlier post by the same author, and above all this full-scale example in which the author explains his RAID 1(3)+0(4)-style setup. […]
