[Spatdif] Organizing SpatDIF by layers


Trond Lossius
Hi,

As a follow-up to the Berlin meeting earlier this fall, I have finally found the time to go through the SpatDIF spec (v.0.2 proposal) and sort the different descriptors by layer, according to the stratified model outlined in the 2009 SMC paper [1]. This is something that I have felt the need to do for a long while.

Not all descriptors are immediately obvious to place in one particular layer, and whether the classification proposed below should be adjusted is open to further discussion. Also, for the time being I am ignoring the distinction between core and extension, although I am convinced that it might be fruitful to permit extensions. Likewise I'm disregarding which descriptors are already detailed and which need to be developed in the future.

Anyway, I believe it illustrates that we are currently addressing descriptors at more or less all layers. Having done the sorting below and then re-read the proposed specification, I am left with the impression that it would be beneficial to restructure the descriptors by layer. In hindsight, the current proposal seems to be well on its way towards becoming cluttered by trying to deal with everything at the same time.

The sorting below also makes it clear to me that it would be relatively easy and fast to finalize a first specification for the 5th layer (scene description), so that we could develop a number of implementations in e.g. Jamoma, ICST-tools and Iosono software. From discussions in the meeting this also seemed like the layer where we might have the most instant gratification, if we would be able to achieve interoperability between Jamoma, ICST and Iosono (and possibly Ircam's Open Music).

It should also be possible to start detailing specifications for the 4th (encoding) and 3rd (decoding) layers.

The 6th layer (authoring) seems the least developed, and the descriptors we have dealt with there so far are the most haphazard selection (with the exception of the bottom-most hardware layer, though I am not sure we will need a descriptive format addressing the hardware layer at all).

On a side note I now find the use of the term "layer" misleading for distinguishing between core (or basic) and extension (or advanced) descriptors in sec. 2 and in particular fig. 2. I believe we should reserve "layer" for discussions relating to the stratified model.

So, I'll be curious to hear feedback on whether the following is perceived as a meaningful strategy for structuring descriptors, and if it is an approach that might be productive to continue for future work on SpatDIF?

Best,
Trond





Conventions, descriptors and methods that might be applicable at several different layers:

- A division into meta and scene sections (sec. 2.1-2.2)
- Time description conventions (sec. 2.2.1, app. B)
- Gain Unit conversions (app. A)
- Coordinate systems conventions (app. C)
- Orientation systems conventions (app. D)
- Private (sec. 3.4.3)


6th Layer - Authoring

- Loop (sec. 3.4.2)
- Geo-transform (sec. 4.1.1)
- Group (sec. 4.1.2) - this might be desirable at the 5th layer as well, and could be considered a way of introducing compound sources
- Interpolation (sec. 4.1.1)
- Trajectory (sec. 4.1.4)


5th Layer - Scene Description

- Sources (sec. 3.1-3.2)
        - Point
        - Compound source (group) (sec. 4.1.2)
- Source directivity (sec. 4.2.1)
- Media (sec. 3.3.1)


4th Layer - Encoding

- Descriptors specific to rendering techniques:
        - Distance-cues (sec. 3.4.1)
        - Reverb (sec. 4.2.2)
        - Ambisonic encoding (sec. 4.3.1)
        - Binaural (this technique will generally not be able to separate encoding and decoding)
        - Direct to one speaker only (sec. 3.3.2)


3rd Layer - Decoding

- Sinks (sec. 3.1-3.2)
        - Speaker
        - Listener
        - Microphone
- Sink directivity (sec. 4.2.1)
- Rendering-specific:
        - Ambisonic decoding (sec. 4.3.1)


2nd Layer - Hardware Abstraction

- Hardware-out (sec. 3.3.2)


1st Layer - Physical Device Layer
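
The sorting above can be written down as a small lookup table, which makes it easy to ask in which layer a given descriptor is listed. This is only a sketch: the lower-case identifiers are hypothetical spellings chosen for illustration, not keywords from the SpatDIF specification, and the cross-layer conventions (time, gain, coordinates, orientation, private) are left out.

```python
# Layer number -> (layer name, descriptors sorted into that layer in this post).
# Identifier spellings are hypothetical, chosen only for this illustration.
LAYERS = {
    6: ("Authoring", ["loop", "geo-transform", "group", "interpolation", "trajectory"]),
    5: ("Scene Description", ["source", "compound-source", "source-directivity", "media"]),
    4: ("Encoding", ["distance-cues", "reverb", "ambisonic-encoding", "binaural",
                     "direct-to-one-speaker"]),
    3: ("Decoding", ["sink", "sink-directivity", "ambisonic-decoding"]),
    2: ("Hardware Abstraction", ["hardware-out"]),
    1: ("Physical Device Layer", []),
}


def layers_of(descriptor):
    """Return the sorted list of layer numbers in which a descriptor is listed."""
    return sorted(n for n, (_, names) in LAYERS.items() if descriptor in names)


def describe(layer):
    """Return a one-line summary of a layer and its descriptors."""
    name, names = LAYERS[layer]
    return "Layer %d (%s): %s" % (layer, name, ", ".join(names) or "none yet")


if __name__ == "__main__":
    for number in sorted(LAYERS, reverse=True):
        print(describe(number))
```

If grouping ends up being permitted at the 5th layer as well, that would amount to adding "group" to the second entry, and `layers_of("group")` would then report both layers.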




REFERENCES:

[1] Peters N., Lossius T., Schacher J., Baltazar P., Bascou C. & Place T. (2009): A stratified approach for sound spatialization. Proceedings of The 6th Sound and Music Computing Conference, 23-25 July 2009, Porto. http://www.spatdif.org/papers/Spatialization-SMC09.pdf
_______________________________________________
Spatdif mailing list
[hidden email]
https://mail.bek.no/mailman/listinfo/spatdif

Re: [Spatdif] Organizing SpatDIF by layers

Nils Peters

Hi Trond,

Thanks for the work.
It is interesting to see the specs from the perspective of our SMC paper.
I created the attached figure (which is based on some previous plots) to
visualize your classification.
Here are some comments:
Interestingly, our core descriptors are not present in all layers, and they
probably don't have to be.
I'm not sure if there should be grouping in Layer 5, but maybe there is
an advantage in having grouping information available for a clever encoding
algorithm at the following layer.

cheers,

Nils





Attachment: SpatDIF-layers2.pdf (122K)

Re: [Spatdif] Organizing SpatDIF by layers

Jean Bresson
In reply to this post by Trond Lossius
Hi list,

I thought it could be of some interest to the list to post an update about the situation in OpenMusic / Spat scene description and communication.



So far we proposed and implemented our spatial scene representation and encoding using the SDIF format: 

- Trajectories are stored as sampled data streams, so we assume that any generative process and any decision about sampling etc. is made prior to the file-writing stage.
(It is possible, however, for the "reader" of this file to apply post-processing, e.g. for resampling or smoothing.)
- SDIF embeds the concept of multiple interleaved (and independent) streams containing time-tagged frames (no sample rate). We consider that each stream represents one source (and therefore do not / cannot really describe "hierarchical" scene graphs so far).
- The SDIF frames store the 3D Cartesian coordinates of the source at time t (we could possibly use spherical or polar coordinates as well), along with most of the spatialization parameters available in the Ircam Spatialisateur (for the moment the only known compatible rendering system): aperture, orientation, perceptual parameters, etc. [see the proposed/implemented type descriptors here]
- We use "NVTs" ("name-value tables") in the file header to store optional info (name, pathnames, etc.) about the different sources.
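
The stream layout described in the list above can be mimicked with plain in-memory data to show what "time-tagged frames, no sample rate" means for a reader. This is a sketch only: it does not read or write the SDIF binary format, the stream IDs and coordinates are made-up values, and linear interpolation is just one possible choice of reader-side post-processing.

```python
from bisect import bisect_right

# One stream per source: a list of (time_in_seconds, (x, y, z)) frames, sorted by time.
# Frames are irregularly spaced on purpose; there is no fixed sample rate.
# All values are invented for illustration.
streams = {
    1: [(0.0, (0.0, 1.0, 0.0)), (2.0, (2.0, 1.0, 0.0)), (2.5, (2.0, 3.0, 0.0))],
    2: [(0.5, (-1.0, 0.0, 0.0))],
}


def position_at(frames, t):
    """Linearly interpolate a source position at time t from time-tagged frames.

    Before the first frame the first position is held; after the last frame
    the last position is held.
    """
    times = [frame_time for frame_time, _ in frames]
    i = bisect_right(times, t)
    if i == 0:
        return frames[0][1]
    if i == len(frames):
        return frames[-1][1]
    (t0, p0), (t1, p1) = frames[i - 1], frames[i]
    a = (t - t0) / (t1 - t0)
    return tuple(c0 + a * (c1 - c0) for c0, c1 in zip(p0, p1))


def resample(frames, step, end):
    """Reader-side post-processing: sample a stream on a regular time grid."""
    count = int(end / step) + 1
    return [(k * step, position_at(frames, k * step)) for k in range(count)]
```

Because each stream is independent, a reader can resample or smooth one source without touching the others, which is also why a flat set of streams cannot express a hierarchical scene graph by itself.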

OpenMusic can write such SDIF description files from its internal (matrix-like) sound scene representations (via the OM-Spat library, distributed through the Ircam Forum).
A command-line version of Ircam's Spat (also distributed through the Forum) has been developed by Thibaut Carpentier. It performs offline spatial rendering given a loudspeaker configuration and the SDIF scene description file (the SDIF frames more or less behave as control messages that would be sent to Spat at time t).

We also developed a Max standalone application (Spat-SDIF-Player [*]) which uses the MuBu buffering toolkit to load the SDIF file data and control its streaming as OSC messages.
These messages comply with the SpatDIF specification as far as possible. In principle, they can be received and decoded by any OSC/SpatDIF-compliant application.
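
To make the idea of streaming frames as OSC messages concrete, here is a minimal sketch of how one position frame could be packed into a binary OSC 1.0 message using only the Python standard library. The address pattern `/spatdif/source/1/position` is an assumption for illustration; the actual namespace is whatever the SpatDIF specification defines, and this is not the code used in Spat-SDIF-Player.

```python
import struct


def osc_string(text):
    """Encode an OSC string: ASCII bytes, null-terminated, zero-padded to a multiple of 4."""
    data = text.encode("ascii") + b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def osc_message(address, *values):
    """Build a binary OSC 1.0 message whose arguments are all float32."""
    type_tags = "," + "f" * len(values)
    payload = struct.pack(">%df" % len(values), *values)  # big-endian float32
    return osc_string(address) + osc_string(type_tags) + payload


# Hypothetical address pattern and made-up coordinates.
packet = osc_message("/spatdif/source/1/position", 0.0, 1.0, 0.0)
```

The resulting bytes could be sent as a UDP datagram with `socket.sendto`; the time tag of each frame would decide when the sender emits the corresponding packet.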

The latest Ircam Spat distribution includes a new object called "SpatDIF-to-Spat", which converts incoming SpatDIF messages into the relevant command messages for spatoper or spat~.

This work has been partly described in an ICMC paper available on the SpatDIF wiki or on this page of the repmus website, and all of it has been carried out in collaboration with Marlon Schumacher and Thibaut Carpentier.



Best regards,

Jean Bresson

[*] Download links available here.



On 17 Nov 2011, at 14:47, Trond Lossius wrote:

> The sorting below also makes it clear to me that it would be relatively easy and fast to finalize a first specification for the 5th layer (scene description), so that we could develop a number of implementations in e.g. Jamoma, ICST-tools and Iosono software. From discussions in the meeting this also seemed like the layer where we might have the most instant gratification, if we would be able to achieve interoperability between Jamoma, ICST and Iosono (and possibly Ircam's Open Music).



Re: [Spatdif] Organizing SpatDIF by layers

Trond Lossius
In reply to this post by Nils Peters
Hi Nils,

> it is interesting to see the specs from the perspective of our SMC paper.
> I created the attached figure (which is based on some previous plots) to visualize your classification.

Thanks!

Looking at it, I feel a need for some changes at various layers:

6th layer:

- I believe we will eventually need to find a totally different approach to all of this, if it is not to evolve into a pretty arbitrary and endless series of extensions. We'll first need to be able to answer the fundamental question of whether this layer can be standardized at all. Looking at the boxes currently present (and adding splines, as suggested by Matthias in Berlin), it seems to lack a meta-perspective: a way of thinking that can encompass all approaches rather than just end up as an endless addition of new algorithms. I don't currently know how to go about this, and addressing it would probably be a major research project (but a very interesting one).

5th layer:

- Source directivity probably should be a core descriptor, although several spatialisation techniques might be unable to take it into account.

> Here are some comments:
> Interestingly, our core descriptors are not present in all layers, and they probably don't have to be.
> I'm not sure if there should be grouping in Layer 5, but maybe there is an advantage in having grouping information available for a clever encoding algorithm at the following layer.

- I am unsure myself whether grouping belongs at the 5th layer or rather at the 6th. Strictly speaking it might belong at the 6th, but it might be convenient to permit it at the 5th layer instead (or as well).

3rd layer:

- Direct-to-one-speaker shouldn't be a core descriptor.

- Ircam Spat includes algorithms for conditioning the signals to each of the speakers (delay, gain adjustment, EQ). This kind of conditioning should be permitted as descriptors, and it might even be that they should be core. It might also be that this belongs on the 2nd layer rather than 3rd.
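
As an illustration of what per-speaker conditioning descriptors might carry, the sketch below derives a delay and a gain for each speaker so that speakers at different distances are time- and level-aligned at the listening position. This is a textbook alignment scheme assumed for the example (inverse-distance law, fixed speed of sound), not a description of how Ircam Spat implements its conditioning, and EQ is left out entirely.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second, approximate value for air at room temperature


def align_speakers(distances):
    """Return one (delay_seconds, gain_db) pair per speaker.

    Closer speakers are delayed and attenuated so that they match the
    farthest speaker. Distances are in metres and must be positive.
    """
    farthest = max(distances)
    result = []
    for distance in distances:
        delay = (farthest - distance) / SPEED_OF_SOUND
        gain_db = 20.0 * math.log10(distance / farthest)  # 0 dB for the farthest speaker
        result.append((delay, gain_db))
    return result


# Made-up layout: three speakers at 2 m, 3 m and 4 m from the listening position.
conditioning = align_speakers([2.0, 3.0, 4.0])
```

Values like these describe the reproduction setup rather than the scene, which is consistent with the question of whether they belong on the 3rd or the 2nd layer.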


Does it make sense to reorganize SpatDIF according to layers? I personally feel that we would end up with a more structured document than the current proposal. How do the rest of you who attended the Berlin meeting feel?


Best,
Trond



Re: [Spatdif] Organizing SpatDIF by layers

Trond Lossius
In reply to this post by Jean Bresson
Dear Jean,

Thanks for keeping us in the loop with your work. I'm looking forward to investigating it more closely over the coming weeks. I have added the two papers to the list of papers here:

http://www.spatdif.org/papers.html

If there are other SpatDIF related papers that should be added, just let me know.

Best,
Trond

