Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new ring source #181

Closed
wants to merge 2 commits into from
Closed

Conversation

leoyala
Copy link

@leoyala leoyala commented Aug 31, 2023

Created a new ring light source that is defined by specifying the inner and outer radius of the ring. To allow for a truncated ring geometry, the angular range of two halves of the same ring can be specified, this truncated geometry is common in medical optical devices such as laparoscopes.
Also updated .gitignore to ignore common configuration files from IDEs such as CLion, and compiler outputs.

@fangq
Copy link
Owner

fangq commented Aug 31, 2023

@leoyala, thanks for submitting this - however, I believe ring-shaped source is already supported under the disk source, see my commit in 2021

f0975c5

can you give it a try?

@leoyala
Copy link
Author

leoyala commented Sep 1, 2023

Hi @fangq , thanks for pointing me to the disk source. Even though it is True that you can specify a ring from that. As far as I can see, you cannot specify a truncated ring. This means where the ring is split into two parts like in the diagram below (just as an example). This geometry is something that we have worked on commonly in medical devices such as laparoscopes. What do you think if I include my changes to support this within the existing disk source?
truncated ring

@fangq
Copy link
Owner

fangq commented Sep 1, 2023

@leoyala, thanks for explaining this new source type.

I see it has new capabilities beyond the disk/ring source that already exists. so I am happy to accept your contribution, but I will need you to make some changes.

I will leave my comments in the related code in a second, but the first thing I want you to do is to do a git rebase and remove this spurious commit 14ac146

if you haven't done rebase before, check out this tutorial: https://www.educative.io/answers/what-are-git-rebase-and-cherry-pick

after rebase, the PR should only contain the relevant code changes (changing the least number of lines) to ensure other part of the codes were not impacted.

I am also wondering if you can specify which of the parameters are required and what happen if some additional parameters are not specified (thus have a 0 default value)? for example, what if someone set srcparam1.{yzw} to 0? what if one set srcparam1.{zw} to 0 and so on.

I recommend to name this ringsector or annulussector if your angle ranges are required. but if it can default to a ring shape if srcparam1.{zw} are not specified, then naming it as ring is fine.

once you finish the rebase, you should run make pretty to reformat your code to our default code style. this feature requires a tool called astyle, which can be installed on Linux systems or mingw64/cygwin etc.

@@ -1 +1,8 @@
Rakefile

# JetBrains configuration files
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if these are needed

@@ -93,6 +93,7 @@
#define MCX_SRC_PENCILARRAY 14 /**< a rectangular array of pencil beams */
#define MCX_SRC_PATTERN3D 15 /**< a 3D pattern source, starting from srcpos, srcparam1.{x,y,z} define the x/y/z dimensions */
#define MCX_SRC_HYPERBOLOID_GAUSSIAN 16 /**< Gaussian-beam with spot focus, scrparam1.{x,y,z} define beam waist, distance from source to focus, rayleigh range */
#define MCX_SRC_RING 17 /**< a ring for laparoscopic illumination */
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please consider MCX_SRC_RINGSECTOR if the angle params are required

@@ -1385,6 +1385,60 @@ __device__ inline int launchnewphoton(MCXpos* p, MCXdir* v, Stokes* s, MCXtime*
canfocus = (gcfg->srctype == MCX_SRC_SLIT);
break;
}
case (MCX_SRC_RING): {
/* ring light source, can be a complete ring or a truncated ring. the expected parameters for definition of the
light source are as follows:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please format this using doxygen formatted comments, like the rest of the code, see example here
https://www.doxygen.nl/manual/docblocks.html

@dcuccia
Copy link

dcuccia commented Sep 1, 2023

Just listening in and have a minor thought/question, if it would be simpler to maintain/debug/test to have a single angular range, and then simulate the composite ring by superposing two independent solutions. Just a thought!

float random_angle = rand_uniform01(t);
float angle_range_1[2] = {((TWO_PI / 2.f) - cut_angle_1) / 2.f, ((TWO_PI / 2.f) + cut_angle_1) / 2.f};
float angle_range_2[2] = {(TWO_PI + (TWO_PI / 2.f) - cut_angle_2) / 2.f,
(TWO_PI + (TWO_PI / 2.f) + cut_angle_2) / 2.f};
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please keep in mind that division is extremely expensive on the GPU - it is even more costly than sin/cos functions, so avoid using divisions at all cost.

for example, always convert a/2.f to a*0.5f - as the multiplication takes 1 clock cycle to finish and division takes 15 clock cycles or more to complete.

see
https://forums.developer.nvidia.com/t/speed-comparison-of-division-compared-to-other-arithmetic-operations-perhaps-something-like-clock-cycles/168371

CUDA even had to use approximated division __fdividef to speed up the computation
https://docs.nvidia.com/cuda/cuda-math-api/group__CUDA__MATH__INTRINSIC__SINGLE.html#group__CUDA__MATH__INTRINSIC__SINGLE_1gc996beec34f94f6376d0674a6860e107

float minimum_radius = gcfg->srcparam1.x;
float maximum_radius = gcfg->srcparam1.y;
float cut_angle_1 = (TWO_PI / 360.f) * gcfg->srcparam1.z;
float cut_angle_2 = (TWO_PI / 360.f) * gcfg->srcparam1.w;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you minimize the use of registers (temporary variables) in the new code? the current biggest bottleneck to mcx's GPU kernel is the number of registers - the total register used by mcx is ranging between 64 to 100+ (for more complex modes such as SVMC).

you can see other source definitions - we tried to minimize the use of temporary variables and compute temporary values on the fly or reuse some existing ones.

this new source alone added 24 new registers, which amounts to nearly half of the registers used by the barebone mcx kernel. I am pretty sure merging this code to the main code will significantly slow down the rest of the functions as the compiler is not smart enough to optimize out these registers.

Copy link
Owner

@fangq fangq Sep 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that minimize the use of registers will hurt readability, but speed performance is the most important spec of mcx that we have to ensure

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this makes me thinking if you can simply merge the angle range to the existing disk/ring source code? likely this will yield the same result but much more compact implementation. you can still call this ring source but just add a new case statement in this section

case (MCX_SRC_DISK):


float angle = gcfg->srcparam2.x; //full beam divergence angle measured at Full Width at Half Maximum (FWHM)
float FWHM = 2.f * tanf(0.5f * angle * TWO_PI / 360.f); //FWHM of beam divergence
float sigma = FWHM / (2.f * sqrtf(2.f * logf(2.f))); //standard deviation of gaussian with FWHM
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the numerator is a constant and should precompute the reciprocal and convert it to a single multiplication. otherwise, this line is very costly.

@@ -196,7 +196,7 @@ const char boundarydetflag[] = {'0', '1', '\0'};

const char* srctypeid[] = {"pencil", "isotropic", "cone", "gaussian", "planar",
"pattern", "fourier", "arcsine", "disk", "fourierx", "fourierx2d", "zgaussian",
"line", "slit", "pencilarray", "pattern3d", "hyperboloid", ""
"line", "slit", "pencilarray", "pattern3d", "hyperboloid", "ring", ""
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are a few other places where these source type constants are defined, you will have to run cd mcx/src; grep 'zgaussian' -R * and modify all other units. I believe pmcx.cpp and mcxlab.cpp both need to be updated.

@fangq
Copy link
Owner

fangq commented Sep 1, 2023

Just listening in and have a minor thought/question, if it would be simpler to maintain/debug/test to have a single angular range, and then simulate the composite ring by superposing two independent solutions. Just a thought!

glad to hear from you @dcuccia, I am curious what you meant by a single angular range - like specifying the start angle only?

@fangq
Copy link
Owner

fangq commented Sep 1, 2023

@leoyala, upon reading the patch, I now see the biggest problem of merging this patch is the extensive use of registers (24) in your code. This will likely slow down other features - if we increase the register count from 64 to say 90, what will happen is that the GPU cores can only run 50% less simultaneous blocks because the register space on the core/SM is fixed. this will translate to 50% slowing down.

in the past, we found that CUDA compilers is not smart enough to reuse or remove inefficient registers at the compilation stage, so that's why we had to minimize its use from the beginning.

I suggest the best path moving forward is to incorporate your angular sector range parameters with the existing disk/ring source implementation and minimize the use of temporary variables.

let me kow if you want to take a look at the disk code and see if you can incorporate the angular parameters.

@dcuccia
Copy link

dcuccia commented Sep 1, 2023

glad to hear from you @dcuccia

Hey @fangq! Yes, long-time lurker :) Really impressed with the tools and community you have fostered over the years, and like to stay connected to your latest work. We've gotten some good mileage out of MCX for internal studies, and based on what Raj tells me, he really likes the new Python bindings.

I am curious what you meant by a single angular range - like specifying the start angle only?

In the discussion above "angle ranges" were discussed. And based on my superficial reading of the thread, was just curious if the design would be a single angle range (start/stop, default [0, 2Pi]), and then a multi-range composite system could be modeled by a superposition of two of these angle range sources by linearly combining/superposing the results. A multi-range source could be implemented by combining single-range sources as well, I suppose.

@fangq
Copy link
Owner

fangq commented Sep 3, 2023

We've gotten some good mileage out of MCX for internal studies, and based on what Raj tells me, he really likes the new Python bindings.

so glad to hear! yes, we are excited about the new python binding and would advertise it more.

In the discussion above "angle ranges" were discussed. And based on my superficial reading of the thread, was just curious if the design would be a single angle range (start/stop, default [0, 2Pi]), and then a multi-range composite system could be modeled by a superposition of two of these angle range sources by linearly combining/superposing the results. A multi-range source could be implemented by combining single-range sources as well, I suppose.

to be honest, I also did not fully follow the cut-angle calculations. I hope to get some clarifications from @leoyala. I totally agree with you that multiple ring sections should be achieved using a composite/superposition of this type of sources, rather than handling it inside a single source.

@fangq fangq closed this in ad6ff4c Sep 4, 2023
fangq added a commit that referenced this pull request Sep 4, 2023
compact implementation of ring source, close #181
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants