Xsan 4 - Volumes not appearing in Server app after Open Directory reconfig

cutmoney's picture

I'm running Xsan 4 and have run into a problem. My Open Directory settings had stopped accepting changes, and I believe something had become corrupted. Failover to my backup MDC was not working properly, and I could not add new profiles to clients because they would not authenticate. I had to destroy and recreate the Open Directory configuration on my primary MDC (I did archive everything). When I turned Xsan back on in Server app, the SAN volume mounted on the primary MDC, but all of the Xsan data in Server app is blank. It shows the connection as active, but shows no configured volumes and no servers.

So the volume is mounted on the primary MDC, and it also mounts successfully on one other machine that still has an old config file. I have since deleted the config files on the other clients, planning to install new ones once the Open Directory issue was fixed. Has anyone else had this issue, and do you know of a way to get Server app to recognize that the volume is present and mounted? I'm not sure, but I assume that recreating the volume fresh would destroy the data. Alternatively, is it possible to delete /Library/Preferences/Xsan, create a new volume, and then replace its config with my existing .cfg file (which I still have archived)?

kworq's picture

This is an old post, but I hope this info will help anyone with this issue. It's not that uncommon to need to trash OD and start over, whether because of corruption, a host name change, or something else.

First, creating a new volume from LUNs that were part of an old SAN will destroy the data. Second, never delete anything without a backup: Server backup, Xsan backup, Xsan config backup, Xsan metadata dump, etc.

Unlike past versions of Xsan, Xsan 4 stores an Xsan config group in LDAP. If you trash OD, you trash the Xsan SAN info along with it. This prevents the GUI from reading the config and prevents clients from installing the profile to connect to the SAN.

The best case would be to have an archive of your OD to restore from in the event of corruption. If you do not have a backup, here is what you can do if you still have your Xsan config directory (/Library/Preferences/Xsan).

Use cvadmin to stop and unmount the volumes
Back up the Xsan config - sudo cp -r /Library/Preferences/Xsan /Library/Preferences/Xsan.old
Destroy your OD master in the Server app GUI
Move Server app to the Trash
Destroy any remnants of OD - sudo slapconfig -destroyldapserver
Remove certs - sudo rm -R /var/root/Library/Application\ Support/Certificate\ Authority
Remove Kerberos - sudo rm -R /var/db/krb5kdc
Remove the Xsan config (MAKE SURE YOU HAVE A BACKUP) - sudo rm -r /Library/Preferences/Xsan
Move Server app back to Applications
In Server app, create a new OD master
In Server app, turn on Xsan (for the SAN name, use the exact same name you used before)
This creates new Xsan config files and a new Xsan config group in LDAP
Close Server app
Copy the original Xsan config files and fsmlist from your backup into /Library/Preferences/Xsan
Copy those config files into LDAP - sudo xsanctl pushConfigUpdate
Open Server app
You should now have your original volumes listed in the Xsan pane
You can start the SAN from the GUI
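
Since several of the steps above are destructive, it may be worth verifying the config backup before removing anything. Here is a minimal sketch, assuming a plain directory copy is enough for your setup. backup_config_dir is a hypothetical helper written for this post, not an Xsan tool, and the timestamped destination name is my own convention.

```shell
#!/bin/sh
# Hypothetical helper (not part of Xsan): copy a config directory to a
# timestamped sibling and verify that the copy matches the original.
backup_config_dir() {
    src=$1
    dest_parent=$2
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    stamp=$(date +%Y%m%d-%H%M%S)
    dest="$dest_parent/$(basename "$src").$stamp"
    # -R copies recursively, -p preserves modes and timestamps
    cp -Rp "$src" "$dest" || return 1
    # diff -r exits non-zero if any file differs or is missing
    diff -r "$src" "$dest" >/dev/null || { echo "verification failed: $dest" >&2; return 1; }
    # Print the backup location so the caller can record it
    echo "$dest"
}

# Example, run as root on the MDC before the removal steps:
#   backup_config_dir /Library/Preferences/Xsan /Library/Preferences
```

The function prints the backup path only after the recursive diff succeeds, so an empty result means the copy should not be trusted.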

Once you have your users and groups back, make a new archive of OD and a backup of /Library/Preferences/Xsan.

billgarmen's picture

I am in the same situation and have followed these steps. When I run sudo xsanctl pushConfigUpdate I get the following:

bash-3.2# sudo xsanctl pushConfigUpdate
2015-07-22 17:55:31.088 xsanctl[1751:25017] buildSanConfig started
2015-07-22 17:55:31.089 xsanctl[1751:25017] buildSanConfig about to check LDAP
2015-07-22 17:55:31.106 xsanctl[1751:25017] buildSanConfig: getXsanConfig said nothing
2015-07-22 17:55:31.109 xsanctl[1751:25017] buildSanConfig returning error Error Domain=com.apple.OpenDirectory Code=4102 "Could not create the record because one already exists with the same name." UserInfo=0x7fd42ac0d110 {NSLocalizedDescription=Could not create the record because one already exists with the same name., NSLocalizedFailureReason=Could not create the record because one already exists with the same name.}
xsanctl: error pushing configuration: Error Domain=com.apple.OpenDirectory Code=4102 "Could not create the record because one already exists with the same name." UserInfo=0x7fd42ac0d110 {NSLocalizedDescription=Could not create the record because one already exists with the same name., NSLocalizedFailureReason=Could not create the record because one already exists with the same name.}
bash-3.2#

I am not sure what already exists or what else I need to do. Thoughts?

wrstuden's picture

Something is messed up in your new LDAP configuration. xsanctl uses ldapi to connect to the local server. When it asks for the com.apple.xsan.conf.SANNAME record, it gets no response even though the record exists. Alternatively, you don't have the group but you do have the com.apple.xsan.auth.SANNAME user.

This is a very odd state to be in if you just configured the SAN again.
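
One way to see which of those two records actually exists is to ask the directory directly. The sketch below is only an illustration: the helper names are hypothetical, the /Groups and /Users locations are inferred from the description above (a config group and an auth user), and the /LDAPv3/127.0.0.1 node path in the example is an assumption about how the OD master appears locally.

```shell
#!/bin/sh
# Hypothetical helper: print the record paths for the Xsan config group
# and auth user described above, for a given SAN name.
xsan_record_paths() {
    san_name=$1
    echo "/Groups/com.apple.xsan.conf.$san_name"
    echo "/Users/com.apple.xsan.auth.$san_name"
}

# Hypothetical helper: report which of those records a directory node returns.
# dscl is the macOS directory command line tool; the node path is supplied
# by the caller because it is an assumption, not something confirmed here.
check_xsan_records() {
    node=$1
    san_name=$2
    xsan_record_paths "$san_name" | while read -r path; do
        if dscl "$node" -read "$path" RecordName >/dev/null 2>&1; then
            echo "present: $path"
        else
            echo "missing: $path"
        fi
    done
}

# Example, on the OD master:
#   check_xsan_records /LDAPv3/127.0.0.1 SRP_DMS_XSAN
```

If the user shows as present and the group as missing (or the reverse), that would match the half-configured state described above.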

billgarmen's picture

So I destroyed the server and reinstalled everything to try to get these steps to work, to no avail. This is the current error I am getting:

srpdm01:~ dmserver$ sudo xsanctl pushConfigUpdate
Password:
2015-08-05 19:23:00.150 xsanctl[3693:33000] buildSanConfig started
2015-08-05 19:23:00.151 xsanctl[3693:33000] buildSanConfig about to check LDAP
2015-08-05 19:23:00.169 xsanctl[3693:33000] buildSanConfig step 4 with {
globals = {
certSetRevision = "306BD2C8-2A7C-4921-9CFF-FCB0358F79DE";
controllers = {
"3F9770A9-9F8C-5508-8B4D-BB4C83C461A0" = {
IPAddress = "10.11.2.207";
hostName = "srpdm01.srp.gov";
};
};
fsnameservers = (
{
addr = "10.11.2.207";
uuid = "3F9770A9-9F8C-5508-8B4D-BB4C83C461A0";
}
);
notifications = {
FreeSpaceThreshold = 20;
};
revision = "CA9D3D5E-2747-4790-953C-BA87A7FB301E";
sanAuthMethod = "auth_secret";
sanConfigURLs = (
"ldaps://srpdm01.srp.gov:389"
);
sanName = "SRP_DMS_XSAN";
sanState = active;
sanUUID = "EFEED457-0002-44EE-ADA7-BAACCB7C3DBA";
sharedSecret = "********";
};
volumes = {
};
}
2015-08-05 19:23:00.169 xsanctl[3693:33000] buildSanConfig returning error Error Domain=com.apple.xsan.ErrorDomain Code=503 "San UUID mismatch" UserInfo=0x7f811ae10540 {NSLocalizedDescription=San UUID mismatch}
xsanctl: error pushing configuration: Error Domain=com.apple.xsan.ErrorDomain Code=503 "San UUID mismatch" UserInfo=0x7f811ae10540 {NSLocalizedDescription=San UUID mismatch}
srpdm01:~ dmserver$

I cannot find where the server is pulling the second UUID from:
sanUUID = "EFEED457-0002-44EE-ADA7-BAACCB7C3DBA";

Looking at all the files in the Xsan preference folder, I cannot find the second UUID. The first UUID (3F9770A9-9F8C-5508-8B4D-BB4C83C461A0) matches the machine, but I am not sure where the second one comes from.

Any help would be appreciated.
thanks
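
For anyone repeating this search, a recursive grep is a quick way to check every file in a folder for a UUID. This is a rough sketch; find_uuid_in_dir is a hypothetical helper, and it only matches the UUID as literal text, so files that store it in some other encoding would be missed.

```shell
#!/bin/sh
# Hypothetical helper: list the files under a directory that contain a UUID.
# -r recurses, -l prints file names only, -i ignores case (UUIDs are often
# written in either case), -F treats the UUID as a fixed string.
find_uuid_in_dir() {
    uuid=$1
    dir=$2
    grep -rliF -- "$uuid" "$dir"
}

# Example, checking the live config and an archived copy:
#   find_uuid_in_dir EFEED457-0002-44EE-ADA7-BAACCB7C3DBA /Library/Preferences/Xsan
#   find_uuid_in_dir EFEED457-0002-44EE-ADA7-BAACCB7C3DBA /Library/Preferences/Xsan.old
# On macOS, "plutil -p FILE" prints a property list in readable form if you
# want to inspect a matching (or suspect) plist by eye.
```

The function exits non-zero when nothing matches, which is consistent with the value living somewhere other than that folder.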

billgarmen's picture

Also, I can cvadmin into the Xsan volume, see it, and start and stop it. I just can't get it to show up in the Server GUI.

wrstuden's picture

sanUUID mismatch means that there is a mismatch between the UUID for your SAN in LDAP and the UUID in your config.plist. It means the SAN got changed out from under the computer; the SAN named "SRP_DMS_XSAN" isn't the SAN you started with.

It sounds like you don't have the SAN fully set up in the GUI. You'll need that set up for this command to work right. Or you'll have to totally wipe the LDAP config and start over. Note that wiping the LDAP config and creating a new SAN will cause all clients to start hitting the sanUUID mismatch issue.
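
To make the comparison concrete, here is a small sketch that pulls the sanUUID value out of saved xsanctl output so it can be checked against whatever value you read from config.plist. Both helper names are hypothetical, and the thread does not say which side (LDAP or config.plist) the dumped value comes from, so treat the extracted UUID as just one of the two values to compare.

```shell
#!/bin/sh
# Hypothetical helper: read text on stdin and print the first sanUUID value,
# matching lines of the form: sanUUID = "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX";
extract_san_uuid() {
    sed -n 's/.*sanUUID = "\([0-9A-Fa-f-]*\)";.*/\1/p' | head -n 1
}

# Hypothetical helper: succeed only if both UUIDs are non-empty and equal,
# ignoring letter case.
uuids_match() {
    a=$(printf '%s' "$1" | tr 'a-f' 'A-F')
    b=$(printf '%s' "$2" | tr 'a-f' 'A-F')
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example, using output saved to a file (saved_output.txt is a placeholder):
#   dumped=$(extract_san_uuid < saved_output.txt)
#   uuids_match "$dumped" "UUID-READ-FROM-CONFIG-PLIST" && echo same || echo mismatch
```

A mismatch here is only a confirmation of the error message; fixing it still means finishing the SAN setup in the GUI or wiping the LDAP config, as described above.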